SECURE COMPUTER SYSTEM AND
METHOD OF PROVIDING SECURE ACCESS 
TO A COMPUTER SYSTEM INCLUDING A 
STAND ALONE SWITCH OPERABLE TO 
INHIBIT DATA CORRUPTION ON A 
STORAGE DEVICE 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates to computer system archi- 
tecture and more particularly to an architecture for and 
method of limiting remote access to programs and data. 

2. Description of the Related Technology 

The role of computers is rapidly changing from compu- 
tational machines to communication devices. The increasing 
use of the Internet by the general public increases the 
potential for hackers to break into sensitive computers. 
Computer hackers have successfully entered systems 
believed to be secure, gained unauthorized access, corrupted 
data, and infected systems with viruses that continue to 
cause havoc. While specialized software in the form of, for 
example, firewalls, is often provided to prevent unauthorized 
system access and to limit access so that unauthorized 
personnel cannot easily corrupt data and program files or 
otherwise cause damage to a computer system and loss of 
data, hackers are continually finding ways around the soft- 
ware. For example, viruses can be used to infect a computer 
system through infected software, causing the system to 
perform unauthorized functions and execute "rogue" code 
jeopardizing the integrity of the system. Because all func- 
tions performed by the computer system are controlled by 
instructions stored in the computer's memory, providing any 
remote access to the system provides an avenue for hackers 
to gain unauthorized access and do damage. 

A representative computer system according to the prior 
art is shown in block diagram form in FIG. 1. A prior art 
computer system 100 includes a local system bus 102 
connecting major elements of the computer system. Thus, 
local system bus 102 handles the transfer of instructions, 
data, address and control signals, etc. between the elements 
of the computer system. As shown in the figure, central 
processing unit 104 has a direct connection to bus 102 and 
to a dedicated main memory 106. Main memory 106 is 
typically a high speed, high bandwidth random access 
memory storing data and instructions. Non-volatile mass 
storage is provided by hard disk drives 110 and 112 inter- 
facing via SCSI (small computer systems interface) device 
108 to local system bus 102 and hard disk drive 122 
interfacing through IDE (intelligent drive electronics) con- 
troller 120. Central processing unit 104 also has provisions 
for displaying data to a system operator by providing 
appropriate address, data and control signals to video inter- 
face 114 whereby data is displayed on video monitor 116. 
Finally, remote access to peripheral devices and buses is 
provided by serial port 118 and Ethernet interface 124, again 
over local system bus 102. Although not shown, other 
devices providing input and output to the system may be 
included, such as a keyboard, etc., which may include a 
dedicated interface to local system bus 102 or might be 
supported by serial port 118. Similarly, other output devices 
may be included, such as a printer interfacing through serial 
port 118 or an equivalent parallel port type data connection 
(not shown). 

In operation, computer programs consisting of executable 
code and data and other information on which the code 
operates, are stored in main memory 106. Typically, this 
includes an operating system, such as Windows NT or
Windows 98, together with various utilities and application 
programs. At startup or initialization, central processing unit 
104 executes "boot" code, identifies system assets, such as
IDE controller 120 and hard disk drive 122, and locates the
appropriate operating system. The operating system soft- 
ware from hard disk drive 122 is then transferred through 
IDE controller 120 via local bus 102 to main memory 106. 
Central processing unit 104 then executes the operating
system, transferring instructions as needed from main
memory 106 into a "cache" or other local memory and 
registers that are a part of the central processing unit 104. 
While this is happening, dedicated hardware and firmware 
resident in video board 114 provide a visual display on video
monitor 116 of system status and provide a video output for
the operating system, utilities, and application programs. In 
addition to the online data storage provided by hard disk 
drive 122, multiple hard disk drives are supported by SCSI 
controller 108. As depicted, both hard disk drives 110 and
112 are interfaced to local system bus 102 through the SCSI
controller 108 providing additional non-volatile storage
capabilities. 

In addition to local access to computer system 100, 
remote access is provided by serial port 118 and Ethernet
card 124. For example, a modem (not shown) may be
attached to serial port 118 to interface computer system 100 
to other media such as the public switched telephone net- 
work (PSTN), radio and fiber optic systems, etc., thereby 
providing connectivity to remote users and systems. An
appropriate communications utility or application running
on central processing unit 104 together with serial port 118 
supports exchange of data with the remote users and sys- 
tems. Similarly, Ethernet 124 is a specific embodiment of a 
network connectivity supporting, for example, a local area
network (LAN), a wide area network (WAN), etc., with
multiple remote computer systems and other resources 
attached. Using these remote access facilities, computer 
system 100 becomes accessible to authorized, and in many 
cases, unauthorized users. 

Although not shown, other peripherals may be included,
such as CD-ROMs (compact disk read only memories),
CD-WORM (compact disk write once read many) or
CD-WO (compact disk write once), CD-RW (compact
disk re-writeable), DVD-RAM (digital versatile disk
RAM), DVD-ROM (digital versatile disk ROM), various
tape drives and traditional 3½ inch floppy disk drives. These
devices are particularly useful for the transport of data 
between systems and backup purposes using removable 
media. Conventionally, because of access speed and storage
space limitations, these devices are generally not relied upon
as substitutes for hard disk drives which continue to be used 
as the primary media for non-volatile program and data mass 
storage. However, as computer systems have been made 
available to greater numbers of users, both locally and
remotely, maintaining the integrity of programs and data
stored on computer systems has become an increasing 
concern. 

Prior art systems implement various physical and soft-
ware systems to control access to the system and provide
security. For example, computer systems handling classified
information may require TEMPEST approval to avoid unin- 
tended radiation of information, be located in a secure 
facility such as a limited access area to provide physical 
security, and be operated in a stand alone configuration
without provision for remote access to avoid remote hacker
access. Physical security, however, cannot address remote 
access users so that a variety of software is used to establish 
varying authorization levels for remote system use and
access. For example, remote users may be required to 
interface via a secure access or "firewall" system which 
requires a user to establish authorization to access a com- 
puter system prior to providing a connection. A firewall may 
further monitor use of facilities, limiting access and use 
according to the user's authorization. Software on the com- 
puter system itself further monitors access using, for 
example, passwords, personal identification numbers (PINs),
etc. to control access and use. Other software may be
implemented to protect, for example, certain areas of memory
such as the operating system from being altered or over- 
written. Some operating systems, for example, further limit 
write operations to particular areas of memory containing 
data used by a particular application and limit access to other 
areas of memory or alteration of instructions stored in 
memory. However, such software protections have often 
proved inadequate to stop a determined hacker from gaining 
unauthorized access and bypassing such safeguards. For 
example, a hacker might use another program to generate 
and try thousands or millions of access code combinations to 
break into a system. Alternatively, using a more conven- 
tional approach, a hacker might rummage through discarded 
company documents to obtain access code information, 
unlisted maintenance telephone numbers, etc. Access may 
also be obtained by "back doors" into the system otherwise 
used for maintenance, billing, and other non-remote access 
purposes. Hackers may also obtain access by implanting 
computer viruses into the system, often embedded in inno-
cent appearing host software. Once implanted, the virus can
damage the system directly or provide other methods of 
access for the hacker. 

In addition to remote covert action, computer systems are 
also subject to local attacks by, for example, disgruntled 
employees, etc. On a less sinister basis, computer systems 
are further subject to unintentional damage by human opera- 
tor error inadvertently deleting or modifying files and by 
program bugs in the system and applications having similar 
effects and results as that of intentional attacks on the 
system. 

For the foregoing reasons, there exists a need for a secure 
computer system architecture and method for providing 
computer security which cannot be easily bypassed by 
innocent or surreptitious means, either remotely or local to 
the computer system. A further need exists for a computer 
system and method of operating a computer system which 
preserves data and program integrity while providing for 
remote access to users having only read access. A still
further need exists for a computer system and method of 
operating a computer system which prevents data and 
instruction corruption, modification and deletion by 
improper operation of host applications or due to the inten- 
tional actions of software viruses and other rogue executable 
code. 

SUMMARY OF THE INVENTION 

The present invention is directed to a computer system 
and method of operating a computer system which provides 
enhanced data and program security. A system and method 
according to the invention limit access to computer system 
storage media by providing a locally operable switch which 
selectively prevents alteration to the local storage media. 
The switch may be a manually operable mechanical device 
or may be electronic, so long as its operation is isolated from 
the system being protected, and may be entirely self- 
contained. For example, the appropriate control lines 
between a hard disk controller and the hard disk drive are 
routed through a manually operable electrical switch which
can only be manipulated locally and cannot be operated or 
bypassed under computer control. In one configuration, the 
appropriate write enabling conductor of the cable is physi-
cally interrupted by the mechanical switch when in a secure
mode and, instead, the appropriate write disabling signal is 
applied to the hard disk drive. This basic configuration and 
method can be applied to various computer system archi-
tectures to support stand alone, multiuser and remote access
capable computer systems.
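
The signal routing described in this configuration can be modeled in a few lines. The following is only an illustrative sketch, not the circuit itself: the names SwitchPosition, WRITE_DISABLING_LEVEL and level_at_drive are hypothetical, and the active-low polarity of the write-enable signal is an assumption made for the example.

```python
from enum import Enum


class SwitchPosition(Enum):
    NORMAL = "normal"  # controller's write-enable signal passes through
    SECURE = "secure"  # conductor interrupted; fixed disabling level applied


# Assumption for illustration: the write-enable signal is active-low,
# so a constant logic high (1) acts as the write-disabling signal.
WRITE_DISABLING_LEVEL = 1


def level_at_drive(controller_level: int, position: SwitchPosition) -> int:
    """Return the logic level the drive sees on its write-enable conductor."""
    if position is SwitchPosition.SECURE:
        # The controller's output never reaches the drive in this position.
        return WRITE_DISABLING_LEVEL
    return controller_level
```

In the SECURE position the function ignores its controller_level argument entirely, which mirrors the point that the switch cannot be operated or bypassed under computer control.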

According to another aspect of the invention, a computer 
system includes dual processor elements, one isolated from 
remote access and having facilities for writing information 
to a storage device. The other processor element, while
handling communications with remote devices, is connected
so as to positively inhibit writing or altering data contained 
in the storage device. To further protect system integrity, 
another aspect of the invention configures the communica-
tions processing element as a slave, receiving and executing
instructions from the isolated processing element. The
invention further divides data storing and retrieval functions 
between a pair of hard disk drives used to provide remote 
access. Using this division, remote users may read from one 
hard disk drive, but are incapable of altering the contents of
the read only drive. Similarly, remote users can write to the
other hard drive, but cannot read information stored by other 
users and cannot target information for alteration or destruc- 
tion. 

According to an aspect of the invention, a digital com-
puter system includes a processor, a storage device and a
manually operative switch. The storage device is responsive 
to the processor for selectively operating in a read mode of 
operation for reading previously stored data and in a write 
mode of operation for storing data. The manually operative 
switch selectively disables the processor from causing the
storage device to operate in the write mode of operation. 
According to a feature of the invention, the manually 
operative switch is connected to interrupt the control signal 
required to cause the storage device to operate in the write 
mode of operation. The manually operative switch may be in
direct electrical contact with the storage device and may be 
in the form of a mechanical switch or may be an electronic 
switch including control software and hardware compo- 
nents. 

According to another feature of the invention, the pro- 
cessor includes a central processing unit, a controller which 
is in direct electrical contact with the manually operative 
switch, and a bus which connects the central processing unit
and the controller.

According to another aspect of the invention, a digital 
computer system includes a storage device, first and second 
central processing units, and a first manually operative 
switch. The storage device is responsive to a control signal
for selectively operating in a read mode of operation for
reading previously stored data and in a write mode of 
operation for storing data. The first and second central 
processing units are each capable of providing this control 
signal. The switch then alternatively provides the control
signal from either the first or second central processing unit
to the storage device. According to a feature of the invention, 
the system further includes a second manually operative 
switch selectively disabling the storage device from oper- 
ating in the write mode of operation. 

According to another aspect of the invention, a digital
computer system includes a processor, a secure data storage 
device and a manually operative switch. The secure data 
device is responsive to a write control signal from the data
processor for selectively storing data. The switch is manu-
ally selectable to enable and disable receipt by the secure
data storage device of the write control signal.

According to a feature of the invention, the manually
operative switch selectively applies a predetermined fixed
control signal to the secure data storage device instead of the
write control signal. The secure data device may be a non-
volatile memory including a hard disk drive.

According to another feature of the invention, a bus
connects the processor to the secure storage device for
transmission of the control signal so that the manually
operative switch selectively enables and disables a trans-
mission of the control signal along the bus.

According to another feature of the invention, the pro-
cessor includes a central processing unit and a disk control-
ler connected to each other by a system bus. The secure data
device includes a disk drive electrically connected through
the manually operative switch to the disk controller for
receiving the control signal so that the manually operative
switch selectively enables and disables transmission of the
control signal. Another disk drive may be included together
with another disk controller connected to the system bus for
selectively writing data to and reading data from the addi-
tional disk drive in the form of, for example, an array of
multiple hard disk drives (e.g., redundant array of indepen-
dent disks, or "RAID"). These additional disk drives may be
connected independent of the manually operative switch or
may be connected with a second manually operative switch
to prevent writing to the additional disks.

According to another feature of the invention, the digital
computer system further includes first and second disk
controllers connected to respective master and slave central
processing units by a system bus. The secure data storage
device includes a first disk drive electrically connected
through the manually operative switch to the first disk
controller for receiving a control signal from the master
central processing unit whereby the manually operative
switch selectively enables and disables transmission of the
control signal to the first disk drive. The second disk drive
is connected to the second disk controller and is accessible
by the master and slave central processing units over the
system bus. Alternatively, the first and second disk control-
lers may be included on separate buses accessible only by
the respective master and slave central processing units.

According to another feature of the invention, a second
manually operative switch is interposed between the second
disk drive and the second disk controller to selectively
disable reading from or, in an alternate configuration, writ-
ing to the second disk drive.

According to another feature in the invention, the com-
puter includes a third disk controller and disk drive with the
disk drive operative to mirror data stored in the second disk
drive.

According to another feature in the invention, the com-
puter system includes a first program memory connected to
and storing instructions executable by the master central
processing unit. A second program memory is connected to
and stores instructions executable by the slave processing
unit with a processor bus connecting the master and slave
central processing units. A communications controller may
be connected to the system bus to provide for remote access.

According to another aspect of the invention, a computer
system includes a processor, a manual switch and a data
storage device. The switch is connected to selectively trans-
mit a control signal received from the processor and,
alternatively, a write inhibiting control signal. In response to
the signal received from the switch, the data storage device
selectively stores data or is inhibited from doing so.

According to a feature of the invention, the storage device
is responsive to the control signal transmitted by the manual
switch for selectively operating in read and write modes of
operation so that the write-inhibiting control signal causes
the data storage device to operate only in the read mode of
operation and/or other modes protecting the integrity of the
data (e.g., internal refresh only).

According to another feature of the invention, the pro-
cessor includes a first disk controller and the data storage
device is a first disk drive. According to another feature of
the invention, a second disk drive may also be connected to
the first disk controller or may be connected to its own,
second disk controller.

According to another aspect of the invention a digital
computer system includes a processor, a storage device and
a switch. The storage device is responsive to the processor
for selectively operating in a plurality of operating modes
including a read mode of operation for retrieving previously
stored data and a write mode of operation for storing data.
The switch is operable to selectively enable and disable at
least one of the operating modes, the switch being control-
lable by means distinct and separate from the processor so
that the processor is inhibited from controlling the operation
of the switch. According to a feature of the invention, the
switch may be manually operated to selectively make and
break an electrical conducting path connecting the processor
with the storage device.

Alternatively, the switch may include a controller, an
operation of which is independent of the processor for
selectively enabling and disabling at least one of the oper-
ating modes. At least one of the operating modes may be a
read mode of operation and, alternatively, may be a write
mode of operation. According to a feature of the invention,
a second "master" processor is isolated from the first pro-
cessor and both (i) controls the switch and (ii) reads and
writes to the storage device.

According to another feature of the invention, the storage
device may include a magnetic media and comprise a disk
drive or a magnetic tape. The storage device may alterna-
tively include a non-volatile electronic memory device, such
as an EEPROM.

According to still a further feature, the storage device may
include an optical storage device such as a CD-ROM or an
electro-optical source device such as a CD-RW.

According to still another feature of the invention, the
digital computer includes a processor with a first memory
storing program instructions and a distinct and separate
memory storing data. The first memory may be operable in
the read only mode of operation in which the program
instructions are protected from alteration and erasure by the
central processing unit.

According to another aspect of the invention, a method of
operating a digital computer system includes the steps of
supplying a variable control signal to a disk drive and
writing data to the disk drive in response to the variable
control signal. A manual electrical switch is operated so as
to disconnect the variable control signal from the disk drive
and instead, connect a fixed control signal to the disk drive.
The disk drive is then operated in a mode other than a write
mode of operation in response to the fixed control signal.

According to a feature of the method, remote access to the
disk drive is provided only when operating in the mode other
than the write mode of operation, i.e., in the secure mode
inhibiting changes to the hard disk drive.
These and other features, aspects and advantages of the
present invention will become better understood with regard
to the following description, claims and accompanying
drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a computer system according
to the prior art.

FIG. 2 is a block diagram of a computer system according
to the invention including a switch for inhibiting a hard disk
drive from operating in a write mode of operation and
segmented main memory.

FIG. 3 is a pin-out diagram and table for an IDE connec-
tor.

FIGS. 4a and 4b are front and rear views of a stand alone
switch device for insertion between a SCSI controller and
one or more SCSI devices.

FIG. 5 is a flow diagram for a software implemented
switch for restricting operation of designated peripheral
devices to programmed modes of operation.

FIG. 6 is a block diagram of a computer system according
to another embodiment of the invention including a switch
for connecting a storage unit to a stand alone processing unit
or to a processor providing for remote access.

FIG. 7 is a block diagram of a computer system according
to another embodiment including isolated (i) secure local
and (ii) remote processing systems sharing common hard
disk facilities under the exclusive control of the secure local
processor.

FIGS. 8a and 8b are front and rear views of a switching
device for selectively connecting one of two SCSI control-
lers to a plurality of SCSI devices and for limiting operation
of those SCSI devices to programmed modes of operation
when connected to the second of the SCSI controllers.

FIG. 9 is a block diagram of a computer system according
to the invention including a master/slave architecture.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENT

Referring to FIG. 2 of the drawings, a computer system
200 includes conventional devices 102-124 as discussed in
connection with the prior art with the (i) addition of switch
202 interposed between SCSI controller 108 and hard disk
drive 110 and (ii) partitioning of main memory into separate
instruction memory 106a and data memory 106b. Instruc-
tion memory 106a may include various forms and levels of
protection. For example, instruction memory 106a may be
implemented in the form of an EEPROM with a manual
erase and programming feature. Thus, CPU 104 would have
read-only access to instruction memory 106a unless and
until the associated EEPROM was manually provided with
the proper control signals to allow its programming. This
feature prevents unauthorized modification of programming
and provides security against viruses attacking the program
code. In contrast, data memory 106b is a conventional RAM
for the temporary storage of data, including system and
application program parameters and variables.

Switch 202 may be configured as a part of SCSI controller
108, hard disk drive 110, or as a separate auxiliary device.
Switch 202 may be exclusively manually operable to inhibit
a hard disk drive from altering or erasing data. Alternatively,
switch 202 may be an electronic switch controlled by a
control signal physically inaccessible to or by CPU 104.
Typically, hard disk drive 110 is responsive to read and write
requests from SCSI controller 108. Switch 202 is effective
to selectively inhibit operation in the write (or read) mode so
that, effectively, hard disk drive 110 can be operated in either
a read/write mode, or in a read only or write only mode of
operation.

If switch 202 is included as part of SCSI controller 108,
then it is connected to inhibit write requests from CPU 104
(or other devices) from being sent to hard drive 110. If
switch 202 is instead incorporated into hard drive 110, it can
be connected to inhibit operation of hardware used to
operate the disk drive's write heads. For example, the switch
202 can be configured to cut power to a write head's output
circuitry. Preferably, hard disk drive 110 and/or SCSI con-
troller 108 provide the appropriate status and/or error mes-
sages to CPU 104 when operating in a write inhibited or read
only mode of operation or when a write operation is
requested and the write mode has been disabled.

Switch 202 may also be configured as an auxiliary, stand
alone device mounted in a switch box enclosure with
appropriate terminals for connecting controller 108 to hard
disk drive 110. In this configuration, switch 202 is operative
in a first read/write position to pass signals from controller
108 to hard disk drive 110 without change. In a write inhibit
or read only mode of operation, switch 202 will not pass
signals from controller 108 to hard disk drive 110 which
would cause hard disk drive 110 to be placed in a write mode
of operation. For example, pin 50 of a SCSI interface may
be set to the appropriate logic level when a device
is accessed so as to limit operation of the selected device to
either a read or write mode as appropriate. Alternatively,
switch 202 may be connected between IDE controller 120 and
hard disk drive 122 to selectively restrict access and control
of the latter. Using an IDE interface, a pin-out diagram for
which is shown in FIG. 3 of the drawings, write strobes from
the controller are transmitted to the hard drive on pin 23.
That is, the controller signals the hard drive that data
supplied on pins 3-18 is ready to be written by driving a
control signal applied at pin 23 to a "low" logic level. Thus,
in a secure mode of operation wherein writing to a hard drive
is to be inhibited, pin 23 is connected to a high level logic
signal source so that the hard disk drive does not receive the
write strobe signal necessary to cause it to perform a write
operation.

Alternatively, switch 202 may include appropriate hard-
ware and software to monitor signals transmitted by con-
troller 108 to hard disk drive 110. Write (or other inhibited
actions such as read, erase, etc.) commands to one or more
designated devices would be recognized and intercepted,
switch 202 generating an appropriate error message back to
controller 108. Permissible operations would be transmitted
through to disk drive 110 without impediment. In this
software implementation of switch 202, predetermined por-
tions of disk drive 110 may be designated as secure so that
write commands are selectively inhibited only to designated
tracks, sectors, clusters, etc.

FIGS. 4a and 4b show a stand alone, programmable
embodiment of switch 202 which can accommodate eight
peripheral devices on a SCSI interface. Switch 202 is
mounted in enclosure 210 and includes panel mounted
programming switches 212a-212h associated with respec-
tive SCSI devices 0-7. Each of the programming switches is
selectable to designate a read only, read/write, or write only
mode of operation for the respective device. Once
programmed, the status of each device is indicated by a
tricolor LED 214 associated with each switch, green, for
example, indicating read/write capabilities, yellow that the
corresponding device can be operated in a read only mode
of operation (write-inhibited), and red indicating that the
corresponding device is operable in a write only mode of
operation (i.e., read operations are inhibited). As shown in 
FIG. 4a, devices 0 and 1 are being operated in write only 
modes (i.e., a "secure" mode), devices 2 and 4 in read only 
modes (another "secure" mode), and devices 3, 5, 6, and 7 
in read/write modes (i.e., are not being operated as "secure" 
devices). 
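
The relationship between a device's programmed mode and its status light can be tabulated directly. The sketch below is illustrative only: the names Mode, LED_COLOR, FIG_4A_MODES, is_secure and status_lights are hypothetical, and the color assignments are the example colors given above rather than a requirement.

```python
from enum import Enum
from typing import Dict


class Mode(Enum):
    READ_WRITE = "read/write"
    READ_ONLY = "read only"    # write-inhibited
    WRITE_ONLY = "write only"  # read-inhibited


# Example colors from the description of tricolor LED 214.
LED_COLOR = {
    Mode.READ_WRITE: "green",
    Mode.READ_ONLY: "yellow",
    Mode.WRITE_ONLY: "red",
}

# Device settings as described for FIG. 4a.
FIG_4A_MODES: Dict[int, Mode] = {
    0: Mode.WRITE_ONLY, 1: Mode.WRITE_ONLY,
    2: Mode.READ_ONLY, 4: Mode.READ_ONLY,
    3: Mode.READ_WRITE, 5: Mode.READ_WRITE,
    6: Mode.READ_WRITE, 7: Mode.READ_WRITE,
}


def is_secure(mode: Mode) -> bool:
    """A device is 'secure' when either reads or writes are inhibited."""
    return mode is not Mode.READ_WRITE


def status_lights(modes: Dict[int, Mode]) -> Dict[int, str]:
    """Map each SCSI device number to the color its LED would show."""
    return {device: LED_COLOR[mode] for device, mode in sorted(modes.items())}
```

Calling status_lights(FIG_4A_MODES) gives red for devices 0 and 1, yellow for devices 2 and 4, and green for the rest.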

A key switch 216 may be included to control the operation
of switch 202. In the "OUT" mode, the switch is functionally 
inoperative so that the operations of all devices are unre- 
stricted as would be indicated by green status lights 214. In 
the "SECURE" mode, the programmed mode limits would 
be effective to limit read and write modes of operations. The 
"SET" mode is used to program switch 202 according to 
switches 212a-212h. A corresponding key (not shown) is
removable from key switch 216 in the "OUT" and 
"SECURE" positions so that switch 202 can be left locked 
and unattended. Preferably, the "SET" position of key 
switch 216 is a temporary position with a spring returned to 
the "SECURE" position upon completion of programming. 
When switch 216 is in the "SET" mode, the positions of
switches 212a-212h are read and the corresponding mode
limitations are stored in memory as would be indicated by 
status indicators 214. 
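
The three key positions can be summarized as a small state model. This is a sketch under stated assumptions, not the device firmware: the class KeyedSwitch, its method names, and the use of plain strings for modes are all hypothetical, and the spring return from "SET" is modeled as happening immediately after the panel switches are latched.

```python
from enum import Enum
from typing import Dict, Optional


class KeyPosition(Enum):
    OUT = "OUT"        # switch functionally inoperative
    SECURE = "SECURE"  # programmed limits enforced
    SET = "SET"        # momentary: latch the panel switch positions


class KeyedSwitch:
    """Hypothetical model of key switch 216 governing switch 202."""

    def __init__(self) -> None:
        # Until programmed, every device 0-7 is treated as unrestricted.
        self.stored: Dict[int, str] = {dev: "read/write" for dev in range(8)}
        self.position = KeyPosition.OUT

    def turn(self, position: KeyPosition,
             panel: Optional[Dict[int, str]] = None) -> None:
        if position is KeyPosition.SET:
            if panel is None:
                raise ValueError("SET requires the panel switch positions")
            self.stored.update(panel)            # read switches, store limits
            self.position = KeyPosition.SECURE   # spring return
        else:
            self.position = position

    def effective_mode(self, device: int) -> str:
        if self.position is KeyPosition.OUT:
            return "read/write"                  # all devices unrestricted
        return self.stored[device]
```

The stored limits survive a trip through "OUT", so returning the key to "SECURE" restores the programmed behavior without reprogramming.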

A rear view of switch 202 is presented in FIG. 4b
including panel mounted SCSI connectors 220 and 222 for 
connecting the switch to a SCSI controller and to SCSI 
devices being controlled, respectively. 

The operation of switch 202 is shown in the flow diagram 
of FIG. 5. The program starts at entry point 300 with an 
initial decision box 302 handling the set mode of operation 
for programming the device. If switch 202 is in the "SET" 
mode, then the positions of switches 212a-212h are read at
process 304 and the corresponding limitations are stored in 
memory at process 306. If the SET operation has not been 
activated, or upon completion of the programming, process- 
ing continues at step 308 where the numbers of the secure 
devices are read from memory together with the correspond- 
ing allowed modes or inhibited modes of operations, as 
appropriate. In response to receipt of a control signal at 
decision 310, the program decides if the control signal is 
directed to a secure device, i.e., a device number previously 
stored as being operated in a "SECURE" mode with either 
read or write operations inhibited. If the control signal is 
directed to a device which is not subject to read or write 
limitations, such as devices 3, 5, 6, and 7 according to FIG. 
4a, then the control signal is transmitted to that device at 
process 316. However, if the control signal is directed to a 
device which is being operated in a secure mode of operation 
(devices 0, 1, 2, and 4 in this example), then the process 
determines at decision box 312 if the requested operation 
has been inhibited. For example, device numbers 0 and 1 are 
being operated in a read-inhibited mode while devices 2 and 
4 are being operated in a write-inhibited mode. Accordingly, 
read requests directed to devices 0 or 1 and write requests 
directed to devices 2 and 4 would result in the left branch 
being taken out of decision point 312 where the appropriate 
control signal would be inhibited and an error message 
generated back to the requesting controller. Conversely, if 
the operation requested has not been inhibited, the right 
branch is taken out of decision box 312 and the request is 
transmitted to the device address. In either case, process 
flow continues down to terminal 318. At this point, the 
process would conventionally loop back to Start 300 to 
continuously detect and process programming requests and 
SCSI interface commands. 
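
The decision logic of FIG. 5 can be sketched as follows. This is a simplified illustration, assuming the stored limitations are represented as a mapping from device number to the set of inhibited operations. The names `FIG_4A_LIMITS` and `handle_control_signal` are hypothetical, and the returned tuples stand in for forwarding a control signal or returning an error message to the requesting controller.

```python
READ, WRITE = "read", "write"

# Inhibited operations per secure device, following the FIG. 4a example:
# devices 0 and 1 are read-inhibited, devices 2 and 4 are write-inhibited.
FIG_4A_LIMITS = {0: {READ}, 1: {READ}, 2: {WRITE}, 4: {WRITE}}


def handle_control_signal(limits, device, operation):
    """Route one request the way decisions 310 and 312 are described."""
    inhibited = limits.get(device)
    if inhibited is None:
        # Decision 310: not a secure device, so pass the signal through
        # (process 316).
        return ("forwarded", device, operation)
    if operation in inhibited:
        # Decision 312, inhibited branch: block the signal and report
        # an error to the requesting controller.
        return ("error", device, operation)
    # Decision 312, permitted branch: transmit to the addressed device.
    return ("forwarded", device, operation)


def process_requests(limits, requests):
    """Handle a sequence of (device, operation) requests in order."""
    return [handle_control_signal(limits, dev, op) for dev, op in requests]
```

For example, `process_requests(FIG_4A_LIMITS, [(0, READ), (0, WRITE), (3, WRITE)])` reports an error for the first request and forwards the other two.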

Another embodiment of the invention is shown in FIG. 6
depicting a dual processor system, with both read/write and
read only hard drives, each having a dedicated bus, local
memory and storage. A hard drive storage system is
switchable between the processors. The hard drive storage
system includes two disk drives, operable in a non-secure
normal mode of operation in which both drives are
read/write enabled, and in a protected mode wherein one drive is
operated in a read only mode and the other in a write only
mode of operation. In this configuration, the two processors
are isolated from each other, one of the processors providing
for local system operation, the other providing remote
access to the mass storage devices including hard disk
drives. In effect, the system is equivalent to two separate
independent systems on one motherboard when configured
as a personal computer (PC). Both systems require software
to be loaded, and some system configuration to be
performed. Communication between the processors is
provided by the common hard drive storage system.

Operator monitoring of the system performance and
downloading of data acquired by the system is performed by
a primary CPU 104a connected to a first local system bus
102a. The second local data bus 102b supports a
communications CPU 104b. Connected to both buses 102a and
102b, switch 204 physically switches SCSI controller 108b
between the two buses. Hard disk drives 110a and 110b are
connected and controlled by SCSI controller 108b through
write mode disabling switch 202a and read mode disabling
switch 202b, respectively. Switchable SCSI controller 108b
would be switched to main local system bus 102a for
loading and configuration of software under control of main
CPU 104a. After loading and testing of software, SCSI
controller 108b would be switched to local system bus 102b
supporting communications with remote users over serial
port 118b and Ethernet 124. Hard disk drive 110a would
then be operated in a read only mode of operation by switch
202a. Conversely, hard disk drive 110b would be operated in
a "write only" mode of operation so that, for example, any
uploaded data could be checked for viruses prior to that data
becoming available to the system. Further, by placing hard
disk drive 110b in a "write only" mode of operation using
switch 202b, data uploaded to the drive by remote users of
the system cannot be accessed by other remote users, thereby
enhancing system security. This feature is particularly useful
for e-commerce applications where confidential data
received from remote users must be protected from
unauthorized dissemination (e.g., credit card information, etc.).

In the configuration of FIG. 6, the primary CPU 104a and
associated first bus 102a are inaccessible to remote users.
Accordingly, switch 204 and switches 202a and 202b may
be electronically controlled by primary CPU 104a without
jeopardizing the security of the system. This feature is
incorporated into the configuration shown in FIG. 7 wherein
SCSI controllers 108c and 108d are connected to respective
first and second buses 102a and 102b. Switch 206 is
controlled by CPU 104a via serial port 118b connected to
first bus 102a. Switch 206 selectively connects either SCSI
controller 108c or 108d to SCSI hard disk drives 110a and
110b.

In a local mode of operation, switch 206 provides
unlimited access by local SCSI controller 108c to hard disk drives
110a and 110b. Thus, CPU 104a can both read from and
write to the drives. Upon being commanded to connect the
drives to second bus 102b to support remote access, switch
206 disconnects SCSI controller 108c and connects SCSI
controller 108d to the drives subject to preprogrammed
operating mode limitations. For example, when being
accessed by SCSI controller 108d, hard disk drive 110a may
be write inhibited while hard disk drive 110b may be read
inhibited as described in connection with the configuration
of FIG. 6.
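
The local and remote modes of switch 206 can be modeled with a short sketch. It is illustrative only and rests on assumptions: the class name `ControllerSelectSwitch` and its methods are hypothetical, the controllers and drives are identified by their reference numerals as strings, and the remote limitations are fixed to the example given above (drive 110a write inhibited, drive 110b read inhibited).

```python
from dataclasses import dataclass, field

LOCAL, REMOTE = "108c", "108d"   # local and remote SCSI controllers


def _example_limits():
    # Example limitations applied only while the remote controller is
    # connected: 110a is write inhibited, 110b is read inhibited.
    return {"110a": {"write"}, "110b": {"read"}}


@dataclass
class ControllerSelectSwitch:
    """Hypothetical model of switch 206 in the FIG. 7 configuration."""
    remote_inhibits: dict = field(default_factory=_example_limits)
    connected: str = LOCAL

    def connect(self, controller):
        """Connect exactly one controller to the drives."""
        if controller not in (LOCAL, REMOTE):
            raise ValueError("unknown controller: %r" % (controller,))
        self.connected = controller

    def permits(self, controller, drive, operation):
        """Return True if the request would reach the drive."""
        if controller != self.connected:
            return False              # the other controller is disconnected
        if controller == LOCAL:
            return True               # unlimited access in local mode
        return operation not in self.remote_inhibits.get(drive, set())
```

With a fresh `ControllerSelectSwitch()`, the local controller may read and write both drives; after `connect(REMOTE)`, the remote controller may only read drive 110a and only write drive 110b, and requests from the local controller no longer reach the drives.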
FIGS. 8a and 8b show an alternate implementation of a
stand alone switch 210 suited to the dual processor system
shown in FIG. 7. The output of SCSI controller 108c, which
is connected to local system bus 102a, is provided to
connector 230 while SCSI controller 108d, which is
connected to local system bus 102b, is connected to connector
232. A serial connector 236 provides an interface for
optional computer control of the switch.

In this configuration, switch 210 both switches hard disk
drives 110a and 110b between the appropriate SCSI
controller and selectively operates the hard disk drives in the
pre-programmed restricted modes of operation. As shown,
key switch 218 has five positions including "EXTERNAL",
"OFF", "LOCAL", "REMOTE", and "SET". In the "OFF"
mode, neither of the SCSI controllers has access to
peripheral devices including the hard disk drives. In the "LOCAL"
position, signals from and to connector 230 are passed
through without alteration to SCSI devices connected to
connector 234. This mode is applicable to unrestricted
operation of the peripheral devices when under control of
primary CPU 104a which is inaccessible by remote users.

When key switch 218 is placed in the "REMOTE"
position, connector 232 provides access to SCSI devices
connected at connector 234 under the control and
supervision of switch 210 to selectively inhibit predetermined
modes of operation according to stored programming and as
indicated by status indicator lights 214. As previously
described, a temporary, spring loaded "SET" position is
provided for programming switch 210 according to the
positions of switches 212a-212h.

The "EXTERNAL" position allows a secure device, such
as primary CPU 104a, to program and control switch 204 via
a serial RS-232 interface. Thus, so long as the security of
primary CPU 104a is not breached, the operating integrity of
switch 202 is maintained.

Another embodiment of the invention including dual
processors in a master/slave relationship is shown in the
block diagram of FIG. 9. According to this embodiment, one
processor manages communications including, for example,
responding to requests from the Internet. However, the slave
processor only accepts program instructions from the
primary processor. This can be accomplished by appropriate
programming of the system firmware (e.g., BIOS) of the
slave processor. Thus, the slave processor is controlled only
by the master processor and would not be accessible by a
remote computer hacker.

Referring to FIG. 9, a master central processing unit 104a
is connected to dedicated main memory 106a including an
operating system. Master central processing unit 104a is
connected via local system bus 102a to various devices
including (1) hard disk drive 110a through SCSI controller
108a; (2) video control board 114 and video monitor 116; (3)
serial port 118a; and (4) hard disk drive 122 through IDE
controller 120. Slave central processing unit 104b provides
remote access functions and is connected to a local main
memory 106b. Central processing unit 104b connects to
SCSI controller 108b, serial port 118b and Ethernet 124
through local system bus 102b. In turn, SCSI controller 108b
connects to hard disk drive 110b and, via selectable "read
only" switch 202, to hard disk drive 110c. As previously
mentioned, slave central processing unit 104b obtains
operating instructions exclusively from master central
processing unit 104a so that viruses or other changes cannot be
remotely made to its operating instructions or programming.
Critical data that is to be protected from change or deletion
by remote users is stored in hard disk drive 110c operated in
a read only mode of operation. Hard disk drive 110b
supports storage of data by remote users, such as required
for e-commerce, etc.

According to the invention as illustrated by the
embodiments described, the capability of writing to and altering
data is disabled for remote users by disabling hard disk write
capabilities, limiting remote user access to a dedicated and
segregated data processor and associated bus and data
storage, and by isolating control of a communications
processor so that instructions are only executed as received
from [...]. The invention further
enhances security by segregating read and write functions to
different hard drives so that remote users cannot alter
information previously stored on the system nor can they
read information stored by other remote users.

Although the present invention has been described in
considerable detail with reference to certain preferred
embodiments thereof, other embodiments or configurations
are possible. For example, the mode limiting switch is
applicable to other storage devices and media and to other
devices where selection and control of operating modes
must be restricted. For example, a restricted user may be
limited by the switch to monitoring the output of a device
such as a video camera, while a local user may additionally
control the camera. Similarly, the switch may be used in-line
with a printer to allow limited printing capabilities for
certain users while providing full capabilities to local users
of the system. Accordingly, the spirit and scope of the
appended claims should not be limited to the description of
the preferred embodiments contained herein.

What is claimed is:

1. A digital computer system comprising:

first and second electrically isolated buses;

first and second independent central processing units
connected to a respective one of said first and second
buses;

a storage device connected to each of said buses for
selectively storing data; and

a manually operative switch selectively controlling access
by said first central processing unit to inhibit storing
data to said storage device by said first central
processing unit without inhibiting storing data by said second
central processing unit.

2. The digital computer system according to claim 1
wherein said storage device is operable in (i) a read mode of
operation for reading previously stored data and (ii) a write
mode of operation for storing said data.

3. The digital computer system according to claim 2
wherein said manually operative switch is connected to both
said first and second buses to selectively operate said storage
device in a write-only protected mode of operation.

4. The digital computer system according to claim 1
further comprising an interprocessor bus, said first central
processing unit comprising a master central processing unit
and said second central processing unit comprising a slave
central processing unit, said master and slave central
processing units connected to each other by said interprocessor
bus and connecting to respective ones of said first and
second buses, said manually operative switch connected to
both said first and second buses and connected to selectively
transmit to said storage device a control signal required to
cause said storage device to operate in said write mode of
operation.

5. A digital computer system comprising:

first and second independent local buses;

first and second storage devices, each responsive to a
control signal for selectively operating in (i) a read
mode of operation for reading previously stored data
and (ii) a write mode of operation for storing data;

first and second central processing units respectively 
connected to said first and second local buses, each of 
said first and second central processing units capable of 
providing said control signal; 

a first manually operative switch alternatively providing 
said control signals from said first and second local 
buses to said first and second storage devices, said 
switch further configured to selectively operate said 
first and second storage devices in a protected mode of 
operation, said protected mode of operation including 
at least one of a write-only and read-only mode of 
operation. 

6. The digital computer system according to claim 5 
further comprising a second manually operative switch 
selectively disabling at least one of said first and second 
storage devices from operating in said write mode of opera- 
tion. 

7. The digital computer system according to claim 5
further comprising second and third switches, said second
switch selectively inhibiting said first storage device from 
operating in said write mode of operation, said third switch 
selectively inhibiting said second storage device from oper- 
ating in said read mode of operation. 

8. The digital computer system according to claim 7 
further comprising a communications interface providing 
remote access to said second local bus. 

9. The digital computer system according to claim 5 
further comprising switching means having a first state 
wherein said first and second storage devices are operable in 
both said read and write modes of operation and a second 
state inhibiting operation of said first storage device in said 
write mode and of said second storage device in said read 
mode. 

10. The digital computer system according to claim 9 
further comprising a communications interface providing 
remote access to said second central processing unit. 

11. The digital computer system according to claim 5 
further comprising switching means having a first state 
wherein said first and second disk storage devices are 
operable in both said read and write modes and a second 
state causing said first storage device to be operated only in 
said read mode of operation and said second storage device 
only in said write mode of operation. 

12. The digital computer system according to claim 11 
further comprising a communications interface providing 
remote access to said second central processing unit. 

13. A digital computer system comprising: 

first and second system buses electrically independent of 
each other; 

master and slave central processing units connected to 
respective ones of said system buses; 

first and second controllers respectively connected to said 
master and slave central processing units by respective 
ones of said system buses; 

a data storage device responsive to a write control signal
from one of said master and slave processing units on
a respective one of said first and second system buses
for selectively storing data, said data storage device
including first and second storage devices; and

a manually operative switch selectively enabling and 
disabling receipt by said data storage device of said 
write control signal from said first and second system 
buses. 

14. The digital computer system according to claim 13 
wherein said manually operative switch selectively connects 
said data storage device to one of said first and second 
controllers.

15. The digital computer system according to claim 13 
wherein said manually operative switch is operative to 
selectively cause said data storage device to operate in a 
protected mode including a read-only and a write-only mode 
of operation independent of a mode control signal provided 
by one of said master and slave central processing units. 

16. The digital computer system according to claim 13 
wherein said manually operative switch is operative to 
selectively cause said data storage device to operate in a data 
protected mode including one of a read-only and write-only 
mode of operation independent of a mode control signal 
provided by one of said master and slave central processing 
units. 

17. The digital computer system according to claim 13 
further comprising a bus connecting one of said master and 
slave central processing units to said data storage device for 
transmission of said control signal wherein said manually 
operative switch selectively enables and disables a trans- 
mission of said control signal along one of said first and 
second buses. 

18. The digital computer system according to claim 17 
wherein said data storage device comprises a hard disk 
drive. 

19. A digital computer system comprising: 

a first data processing unit including a first central pro- 
cessing unit and a first disk controller connected to each 
other by a first system bus; 

a second data processing unit including a second central 
processing unit and a second disk controller connected 
to each other by a second system bus, said second 
system bus electrically independent of said first system 
bus; 

a secure data storage device responsive to a write control 
signal from each of said first and second data process- 
ing units for selectively storing data, said secure data 
storage device comprising a first disk drive; and 

a manually operative switch selectively enabling and 
disabling receipt by said secure data storage device of 
said write control signal. 

20. The digital computer system according to claim 19 
wherein said first disk drive comprises an array of hard disk 
drives. 

21. The digital computer system according to claim 19 
further comprising another disk drive connected to one of 
said first and second disk controllers independent of said 
manually operative switch. 

22. The digital computer system according to claim 19 
wherein said first disk drive is electrically connected through 
said manually operative switch to said first disk controller 
for receiving said control signal whereby said manually 
operative switch selectively enables and disables a trans- 
mission of said control signal, 

said digital computer system further comprising a second 
disk drive; and a second disk controller connected to 
said second system bus and to said second disk drive 
for selectively writing data to and reading data from 
said second disk drive. 

23. A digital computer system comprising: 
master and slave central processing units; 

master and slave system buses electrically isolated from 
each other and respectively connected to said master 
and slave central processing units; 

a secure data storage device responsive to a write control 
signal from each said master and slave central process- 
ing units for selectively storing data; 

a manually operative switch configured to selectively
enable and disable receipt by said secure data storage
device of said write control signal so as to selectively
operate said secure data storage device in a read-only
mode of operation; and

first and second disk controllers connected to said master
and slave system buses, said secure data storage device
including a first disk drive electrically connected
through said manually operative switch to said first and
second disk controllers for receiving said write control
signal from one of said master and slave central
processing units whereby said manually operative switch
selectively enables and disables transmission of said
write control signal.

24. The digital computer system according to claim 23
further comprising a second disk drive connected to said
second disk controller.

25. The digital computer system according to claim 23
further comprising:

a first program memory connected to and storing
instructions executable by said master central processing unit,

a second program memory connected to and storing
instructions executable by said slave central processing
unit, and

a processor bus connecting said master and slave central
processing units.

26. The digital computer system according to claim 23 
further comprising a communications controller connected 
to said slave system bus. 

27. A digital computer system comprising:

a first central processing unit;

a first system bus connected to said first central processing
unit;

a second central processing unit;

a second bus connected to said second central processing
unit and electrically isolated from said first system bus;

a disk controller;

a first manual switch selectively providing a conductive
path between said disk controller and, in a first position,
said first system bus and, in a second position, said
second system bus; and

a hard disk drive connected to said disk controller and
responsive to a write control signal from said disk
controller for selectively storing information.

28. The digital computer system according to claim 27
further comprising a second manual switch interposed
between said disk controller and said hard disk drive for
selectively transmitting said write control signal
therebetween so as to selectively permit an operation of said hard
drive in a read-only mode of operation.

29. A digital computer system comprising:

a first system bus;

a second system bus;

a first processor connected to said first system bus;

a second processor connected to said second system bus;

a data storage device connected to said first and second
system buses for selectively operating in a plurality of
operating modes so as to access said data storage
device; and

a switch operable to selectively enable and disable at least
one of said operating modes, said switch controllable
by means distinct and separate from at least one of said
processors whereby said one processor is inhibited
from controlling said operation of said switch.

30. The digital computer system according to claim 29 
wherein said switch comprises a manually operated switch
connected to selectively make and break an electrically
conducting path connecting of said first and second system 
base one processor and said data storage device. 

31. The digital computer system according to claim 29 
wherein said switch comprises a digital controller, an opera- 
tion of which is independent of said second processor for 
selectively enabling and disabling said at least one of said 
operating modes. 

32. The digital computer system according to claim 29 
wherein said data storage device is operable in (i) a read- 
only mode of operation for retrieving previously stored data 
and (ii) a write-only mode of operation for storing data. 

33. The digital computer system according to claim 32
wherein said at least one of said operating modes is said 
read-only mode of operation. 

34. The digital computer system according to claim 32 
wherein said at least one of said operating modes is said 
write-only mode of operation. 

35. The digital computer according to claim 32 wherein 
said data storage device comprises a magnetic media. 

36. The digital computer according to claim 32 wherein 
said data storage device comprises a disk drive. 

37. The digital computer according to claim 32 wherein 
said data storage device comprises a magnetic tape. 

38. The digital computer according to claim 32 wherein 
said data storage device comprises a non-volatile electronic 
memory device. 

39. The digital computer according to claim 38 wherein 
said electronic non-volatile electronic memory device com- 
prises an EEPROM. 

40. The digital computer according to claim 32 wherein 
said data storage device comprises an optical storage device. 

41. The digital computer according to claim 32 wherein 
said data storage device comprises an electro-optical storage 
device. 

42. The digital computer according to claim 32 wherein 
each of said first and second processors include a central 
processing unit, a first memory storing program instructions 
and a second memory, separate and distinct from said first 
memory, storing data. 

43. The digital computer according to claim 33 wherein at 
least one of said first memories is operable in a read-only 
mode of operation in which said program instructions are 
protected from alteration and erasure by a corresponding one 
of said central processing units. 

44. A method of accessing a digital storage device using 
a digital computer system, the digital computer system 
including first and second independent local buses, first and 
second central processing units respectively connected to 
said first and second local buses, and a manually operative 
switch, the method comprising the steps of: 

transmitting control signals from said first and second
central processing units to respective ones of said first
and second local buses;

operating said switch to alternatively provide ones of said 
control signals from said first and second local buses to 
the digital storage device and to select a protected mode 
of operation thereof; 

selectively operating the digital storage device in said 
protected mode of operation, said protected mode of 
operation including at least one of a write-only and 
read-only mode of operation; and 

selectively operating said digital storage device respon- 
sive to said control signals in (i) a read mode of 
operation for reading previously stored data and (ii) a 
write mode of operation for storing data.

45. A method of accessing a digital storage device using
a digital computer system, the digital computer system
including first and second system buses electrically
independent of each other, master and slave central processing
units connected to respective ones of said system buses, and
a manually operative switch, the method comprising the
steps of:

transmitting a write control signal from one of said master
and slave processing units;

selectively storing data on said data storage device
responsive to said write control signal; and

operating said switch to selectively enable and disable
receipt by the data storage device of said write control
signal from said first and second system buses.

ABSTRACT



A method of executing coded instructions in a dynamically
configurable multiprocessor having shared execution
resources including steps of placing a first processor in an
active state upon booting of the multiprocessor. In response
to a processor create command, a second processor is placed
in an active state. When either the first or second processor
encounters a cache miss that has to be serviced by off-chip
cache, the processor requiring service is placed in a nap state
in which instruction fetching for that processor is disabled.
When either the first or second processor encounters a cache
miss that has to be serviced by main memory, the processor
requiring service is placed in a sleep state by flushing all
instructions from the processor in the sleep state and
disabling instruction fetching for the processor in the sleep
state.

12 Claims, 10 Drawing Sheets

METHOD OF EXECUTING CODED 
INSTRUCTIONS IN A MULTIPROCESSOR 
HAVING SHARED EXECUTION RESOURCES 
INCLUDING ACTIVE, NAP, AND SLEEP 
STATES IN ACCORDANCE WITH CACHE 
MISS LATENCY 

CROSS-REFERENCES TO RELATED 
APPLICATIONS 

The subject matter of the present application is related to
that of co-pending U.S. patent application Ser. No. 08/881,958
identified as Docket No. P2345/37178.830071.000 for
AN APPARATUS FOR HANDLING ALIASED
FLOATING-POINT REGISTERS IN AN OUT-OF-ORDER
PROCESSOR filed concurrently herewith by
Ramesh Panwar; Ser. No. 08/881,729 identified as Docket
No. P2346/37178.830072.000 for APPARATUS FOR
PRECISE ARCHITECTURAL UPDATE IN AN OUT-OF-ORDER
PROCESSOR filed concurrently herewith by
Ramesh Panwar and Arjun Prabhu; Ser. No. 08/881,726
identified as Docket No. P2348/37178.830073.000 for AN
APPARATUS FOR NON-INTRUSIVE CACHE FILLS
AND HANDLING OF LOAD MISSES filed concurrently
herewith by Ramesh Panwar and Ricky C. Hetherington;
Ser. No. 08/881,908 identified as Docket No.
P2349/37178.830074.000 for AN APPARATUS FOR
HANDLING COMPLEX INSTRUCTIONS IN AN OUT-OF-ORDER
PROCESSOR filed concurrently herewith by
Ramesh Panwar and Dani Y. Dakhil; Ser. No. 08/882,173
identified as Docket No. P2350/37178.830075.000 for AN
APPARATUS FOR ENFORCING TRUE DEPENDENCIES
IN AN OUT-OF-ORDER PROCESSOR filed
concurrently herewith by Ramesh Panwar and Dani Y. Dakhil; Ser.
No. 08/881,145 identified as Docket No.
P2351/37178.830076.000 for APPARATUS FOR DYNAMICALLY
RECONFIGURING A PROCESSOR filed
concurrently herewith by Ramesh Panwar and Ricky C.
Hetherington; Ser. No. 08/881,732 identified as Docket No.
P2353/37178.830077.000 for APPARATUS FOR ENSURING
FAIRNESS OF SHARED EXECUTION
RESOURCES AMONGST MULTIPLE PROCESSES
EXECUTING ON A SINGLE PROCESSOR filed
concurrently herewith by Ramesh Panwar and Joseph I. Chamdani;
Ser. No. 08/882,175 identified as Docket No.
P2355/37178.830078.000 for SYSTEM FOR EFFICIENT
IMPLEMENTATION OF MULTI-PORTED LOGIC FIFO
STRUCTURES IN A PROCESSOR filed concurrently
herewith by Ramesh Panwar; Ser. No. 08/882,311 identified as
Docket No. P2365/37178.830080.000 for AN APPARATUS
FOR MAINTAINING PROGRAM CORRECTNESS
WHILE ALLOWING LOADS TO BE BOOSTED PAST
STORES IN AN OUT-OF-ORDER MACHINE filed
concurrently herewith by Ramesh Panwar, P. K. Chidambaran
and Ricky C. Hetherington; Ser. No. 08/881,731 identified
as Docket No. P2369/37178.830081.000 for APPARATUS
FOR TRACKING PIPELINE RESOURCES IN A
SUPERSCALAR PROCESSOR filed concurrently herewith by
Ramesh Panwar; Ser. No. 08/882,525 identified as Docket
No. P2370/37178.830082.000 for AN APPARATUS FOR
RESTRAINING OVEREAGER LOAD BOOSTING IN AN
OUT-OF-ORDER MACHINE filed concurrently herewith
by Ramesh Panwar and Ricky C. Hetherington; Ser. No.
08/882,220 identified as Docket No.
P2371/37178.830083.000 for AN APPARATUS FOR HANDLING
REGISTER WINDOWS IN AN OUT-OF-ORDER
PROCESSOR filed concurrently herewith by Ramesh Panwar
and Dani Y. Dakhil; Ser. No. 08/881,847 identified as
Docket No. P2372/37178.830084.000 for AN APPARATUS
FOR DELIVERING PRECISE TRAPS AND INTER-
RUPTS IN AN OUT-OF-ORDER PROCESSOR filed con-
currently herewith by Ramesh Panwar; Ser. No. 08/881,728
identified as Docket No. P2398/37178.830085.000 for
NON-BLOCKING HIERARCHICAL CACHE
THROTTLE filed concurrently herewith by Ricky C. Heth-
erington and Thomas M. Wicki; Ser. No. 08/881,727 iden-
tified as Docket No. P2406/37178.830086.000 for NON-
THRASHABLE NON-BLOCKING HIERARCHICAL
CACHE filed concurrently herewith by Ricky C.
Hetherington, Sharad Mehrotra and Ramesh Panwar; Ser.
No. 08/881,065 identified as Docket No. P2408/
37178.830087.000 for IN-LINE BANK CONFLICT
DETECTION AND RESOLUTION IN A MULTI-PORTED
NON-BLOCKING CACHE filed concurrently herewith by
Ricky C. Hetherington, Sharad Mehrotra and Ramesh Pan-
war; and Ser. No. 08/882,613 identified as Docket No.
P2434/37178.830088.000 for SYSTEM FOR THERMAL
OVERLOAD DETECTION AND PREVENTION FOR AN
INTEGRATED CIRCUIT PROCESSOR filed concurrently
herewith by Ricky C. Hetherington and Ramesh Panwar, the
disclosures of which applications are herein incorporated by
this reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention 

The present invention relates in general to microproces-
sors and, more particularly, to a system, method, and pro-
cessor architecture for dynamically reconfiguring a proces-
sor between uniprocessor and selected multiprocessor
configurations.

2. Relevant Background 

Early computer processors (also called microprocessors)
included a central processing unit or instruction execution
unit that executed only one instruction at a time. As used
herein the term processor includes complex instruction set
computers (CISC), reduced instruction set computers (RISC)
and hybrids. The processor executes programs having
instructions stored in main memory by fetching the
instructions, decoding them, and executing them one after the
other. In response to the need for improved performance
several techniques have been used to extend the capabilities
of these early processors including pipelining,
superpipelining, superscaling, speculative instruction
execution, and out-of-order instruction execution.

Pipelined architectures break the execution of instructions
into a number of stages where each stage corresponds to one
step in the execution of the instruction. Pipelined designs
increase the rate at which instructions can be executed by
allowing a new instruction to begin execution before a
previous instruction is finished executing. Pipelined
architectures have been extended to "superpipelined" or
"extended pipeline" architectures where each execution
pipeline is broken down into even smaller stages (i.e.,
microinstruction granularity is increased). Superpipelining
increases the number of instructions that can be executed in
the pipeline at any given time.

"Superscalar" processors generally refer to a class of
microprocessor architectures that include multiple pipelines
that process instructions in parallel. Superscalar processors
typically execute more than one instruction per clock cycle,
on average. Superscalar processors allow parallel instruction
execution in two or more instruction execution pipelines.
The number of instructions that may be processed is
increased due to parallel execution. Each of the execution
pipelines may have a differing number of stages. Some of the
pipelines may be optimized for specialized functions such as
integer operations or floating point operations, and in some
cases execution pipelines are optimized for processing
graphic, multimedia, or complex math instructions.

The goal of superscalar and superpipeline processors is to
execute multiple instructions per cycle (IPC). Instruction-
level parallelism (ILP) available in programs written to
operate on the processor can be exploited to realize this goal.
However, many programs are not coded in a manner that can
take full advantage of deep, wide instruction execution
pipelines in modern processors. Many factors such as low
cache hit percentage, instruction interdependency, frequent
access to slow peripherals, and the like cause the resources
of a superscalar processor to be used inefficiently.

Superscalar architectures require that instructions be
dispatched for execution at a sufficient rate. Conditional
branching instructions create a problem for instruction
fetching because the instruction fetch unit (IFU) cannot know
with certainty which instructions to fetch until the
conditional branch instruction is resolved. Also, when a
branch is detected, the target address of the instructions
following the branch must be predicted to supply those
instructions for execution.

Recent processor architectures use a branch prediction
unit to predict the outcome of branch instructions, allowing
the fetch unit to fetch subsequent instructions according to
the predicted outcome. These instructions are "speculatively
executed" to allow the processor to make forward progress
during the time the branch instruction is resolved.

Another solution to increased processing power is
provided by multiprocessing. Multiprocessing is a hardware
and operating system feature that allows multiple processors
to work together to share workload within a computing
system. In a shared memory multiprocessing system, all
processors have access to the same physical memory. One
limitation of multiprocessing is that programs that have not
been optimized to run as multiple processes may not realize
significant performance gain from multiple processors.
However, improved performance is achieved where the
operating system is able to run multiple programs
concurrently, each running on a separate processor.

Multithreaded software is a recent development that
allows applications to be split into multiple independent
threads such that each thread can be assigned to a separate
processor and executed independently in parallel as if it were
a separate program. The results of these separate threads are
reassembled to produce a final result. By implementing each
thread on a separate processor, multiple tasks are handled in
a fast, efficient manner. The use of multiple processors
allows various tasks or functions to be handled by other than
a single CPU so that the computing power of the overall
system is enhanced. However, because conventional
multiprocessors are implemented using a plurality of discrete
integrated circuits, communication between the devices
limits system clock frequency and the ability to share
resources amongst the plurality of processors. As a result,
conventional multiprocessor architectures result in
duplication of resources which increases cost and complexity.

Given the wide variety and mix of software used on
general purpose processors, it often occurs that some
programs run most efficiently on superscalar, superpipeline
uniprocessors while other programs run most efficiently in a
multiprocessor environment. Moreover, the more efficient
architecture may change over time depending on the mix of
programs running at any given time. Because the
architecture was defined by the CPU manufacturer and system
board producer, end users and programmers had little or no
ability to configure the architecture to most efficiently use
the hardware resources to accomplish a given set of tasks.

SUMMARY OF THE INVENTION

Briefly stated, the present invention involves a method of
executing coded instructions in a dynamically configurable
multiprocessor having shared execution resources including
the steps of placing a first processor in an active state upon
booting of the multiprocessor. In response to a processor
create command, a second processor is placed in an active
state. When either the first or second processor encounters a
cache miss that has to be serviced by off-chip cache, the
processor requiring service is placed in a nap state in which
instruction fetching for that processor is disabled. When
either the first or second processor encounters a cache miss
that has to be serviced by main memory, the processor
requiring service is placed in a sleep state by flushing all
instructions from the processor in the sleep state and
disabling instruction fetching for the processor in the sleep
state.

A processor in accordance with the present invention
includes a processor creation unit responsive to a processor
create command to output signals indicating a current
processor configuration and a plurality of virtual or logical
processors, each virtual processor having a first set of
execution resources that are uniquely identified with the
virtual processor and a second set of execution resources that
are shared amongst the plurality of virtual processors. A
plurality of state machines responsive to the processor
creation unit are provided, each corresponding to a selected
one of the plurality of virtual processors. The state machines
maintain processor status information representative of
whether the processor is available to receive and execute
instructions. The processor further includes status logic
analyzing expected latency of instructions on each processor
and updating the state machine corresponding to any
processor having an instruction with an expected latency
greater than a preselected threshold.

The foregoing and other features, utilities and advantages
of the invention will be apparent from the following more
particular description of a preferred embodiment of the
invention as illustrated in the accompanying drawings.

DESCRIPTION OF THE DRAWINGS

FIG. 1 shows in block diagram form a computer system
incorporating an apparatus and system in accordance with
the present invention;

FIG. 2 shows a processor in block diagram form
incorporating the apparatus and method in accordance with the
present invention;
FIG. 3 illustrates a processor create unit in accordance
with the present invention;

FIG. 4 shows a portion of the processor create unit of FIG.
3 in greater detail;

FIG. 5 shows an instruction fetch unit in accordance with
the present invention in block diagram form;

FIG. 6 illustrates an example format for a branch repair
table used in the fetch unit of FIG. 3;

FIG. 7 illustrates an example instruction bundle in
accordance with an embodiment of the present invention;

FIG. 8 shows in block diagram form an instruction
scheduling unit shown in FIG. 2;

FIG. 9 shows an exemplary entry in an instruction
scheduling window in accordance with the present invention;

FIG. 10 shows an exemplary instruction wait buffer used
in conjunction with the instruction scheduling window
shown in FIG. 9; and

FIG. 11 shows in block diagram form instruction
execution units in accordance with an embodiment of the
present invention.

FIG. 12 shows in block diagram form a data cache (D$),
data cache tag (D$TAG), and data cache translation
lookaside buffer (D$TLB) connected to an L2 cache (L2$).

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENTS

The present invention recognizes the wide variation in
software (i.e., computer instruction code) that must be
accommodated by a general purpose processor. Some code
is most efficiently executed on a single high-speed processor
with multiple deep pipelines. However, some applications
cannot take advantage of these processor architectures.
Also, older software that was written before superscalar
processors were common may not be optimized to take
advantage of the benefits of multiple pipeline execution.
Further, many applications now use multithreading software
techniques that are best implemented on a multiprocessor
platform rather than a single processor platform. The
method, processor, and computer system in accordance with
the present invention allow the processor hardware to be
dynamically configured to meet the needs of a particular
software application.

In such a dynamically configurable multiprocessor,
however, many execution resources may be shared among
the multiple virtual or logical processors on a single
integrated circuit chip. Fairness issues may arise between the
processes if, for example, one process misses in the cache
more frequently than the others. In this case, the process that
misses in the cache occupies space in the shared resources
without doing any useful work while the cache miss is
serviced by higher cache levels or main memory. The
present invention recognizes this fairness issue with a
solution for allocating resources fairly amongst the active
processes.

Computer systems and processor architectures can be
represented as a collection of interacting functional units as
shown in FIG. 1 and FIG. 2. These functional units,
discussed in greater detail below, perform the functions of
storing instruction code, fetching instructions and data from
memory, preprocessing fetched instructions, scheduling
instructions to be executed, executing the instructions,
managing memory transactions, and interfacing with external
circuitry and devices.

The present invention is described in terms of apparatus
and methods particularly useful in a superpipelined and
superscalar processor 102 shown in block diagram form in
FIG. 1 and FIG. 2. The particular examples represent
implementations useful in high clock frequency operation
and processors that issue and execute multiple instructions
per cycle (IPC). However, it is expressly understood that the
inventive features of the present invention may be usefully
embodied in alternative processor architectures that will
benefit from the performance features of the present
invention. Accordingly, these alternative embodiments are
equivalent to the particular embodiments shown and
described herein.

FIG. 1 shows a typical general purpose computer system
100 incorporating a processor 102 in accordance with the
present invention. Computer system 100 in accordance with
the present invention comprises an address/data bus 101 for
communicating information, processor 102 coupled with
bus 101 through input/output (I/O) device 103 for
processing data and executing instructions, and memory
system 104 coupled with bus 101 for storing information and
instructions for processor 102. Memory system 104
comprises, for example, cache memory 105 and main memory
107. Cache memory 105 includes one or more levels of cache
memory. In a typical embodiment, processor 102, I/O device
103, and some or all of cache memory 105 may be integrated
in a single integrated circuit, although the specific
components and integration density are a matter of design
choice selected to meet the needs of a particular application.

User I/O devices 106 are coupled to bus 101 and are
operative to communicate information in appropriately
structured form to and from the other parts of computer 100.
User I/O devices may include a keyboard, mouse, card
reader, magnetic or paper tape, magnetic disk, optical disk,
or other available input devices, including another computer.
Mass storage device 117 is coupled to bus 101 and is
implemented using one or more magnetic hard disks,
magnetic tapes, CDROMs, large banks of random access
memory, or the like. A wide variety of random access and
read only memory technologies are available and are
equivalent for purposes of the present invention. Mass
storage 117 may include computer programs and data stored
therein. Some or all of mass storage 117 may be configured to
be incorporated as a part of memory system 104.

In a typical computer system 100, processor 102, I/O
device 103, memory system 104, and mass storage device
117 are coupled to bus 101 formed on a printed circuit board
and integrated into a single housing as suggested by the
dashed-line box 108. However, the particular components
chosen to be integrated into a single housing are based upon
market and design choices. Accordingly, it is expressly
understood that fewer or more devices may be incorporated
within the housing suggested by dashed line 108.

Display device 109 is used to display messages, data, a
graphical or command line user interface, or other
communications with the user. Display device 109 may be
implemented, for example, by a cathode ray tube (CRT)
monitor, liquid crystal display (LCD) or any available
equivalent.

FIG. 2 illustrates principal components of processor 102
in greater detail in block diagram form. It is contemplated
that processor 102 may be implemented with more or fewer
functional units and still benefit from the apparatus and
methods of the present invention unless expressly specified
herein. Also, functional units are identified using a precise
nomenclature for ease of description and understanding, but
other nomenclature is often used by various manufacturers
to identify equivalent functional units.
Unlike conventional multiprocessor architectures, the
present invention may be, and desirably is, implemented as
a single circuit on a single integrated circuit chip. In this
manner the individual processors are not only closely
coupled, but are in essence merged such that they can share
resources efficiently amongst the processors. This resource
sharing simplifies many of the communication overhead
problems inherent in other multiprocessor designs. For
example, memory, including all levels of the cache
subsystem, is easily shared among the processors, and so
cache coherency is not an issue. Although the resources are
shared, the multiprocessor configuration in accordance with
the present invention achieves the same advantages as
conventional multiprocessing architectures by enabling
independent threads and processes to execute independently
and in parallel.

In accordance with the present invention, processor create
unit 200 is coupled to receive a processor create instruction
from either the computer operating system, a running
application, or through a hardware control line (not shown).
In a specific example, the processor create instruction is
added to the SPARC V9 instruction architecture as a
privileged command that can be issued only by the operating
system. The processor create instruction instructs processor
102 to reconfigure as either a uniprocessor or as one of an
available number of multiprocessor configurations by
specifying a number of virtual processors. In a specific
example, one virtual processor is created for each thread or
process in the instruction code. In this manner, when it is
determined by the operating system, application, or otherwise
that the current instruction code can be executed more
efficiently in a multiprocessor of n processors, the processor
create instruction is used to instantiate n virtual processors to
execute the code. The configuration may change dynamically
in response to new applications starting or a running
application spawning a new thread.

The term "virtual processors" is used herein to describe
the functional operation of the dynamically configurable
processor and method in accordance with the present
invention. Each virtual processor is a logical processor as
opposed to a physically implemented processor. As described
in greater detail below, each virtual processor requires a set of
execution resources that are unique to that processor. These
unique resources are enabled in response to the processor
create command to activate a virtual processor. Also, each
virtual processor requires access to a set of shared execution
resources. These shared resources are enabled independently
of the processor create command. In accordance with the
present invention, when a virtual processor is activated, the
processor behaves as if it is the selected uniprocessor or
multiprocessor; however, no physical reconfiguration,
rewiring, or the like is required.

Referring to FIG. 3 and FIG. 4, processor creation unit
200 may be implemented as a plurality of state machines
301. In the example of FIG. 3, one state machine 301 is
provided for each virtual processor. Any number of state
machines 301, hence any number of virtual processors, may
be included in processor 102. One of the state machines 301
is designated as a primary unit that is analogous to a boot
processor in a conventional multiprocessor design. The
primary state machine 301 will become active automatically
when processor 102 is activated, while the other state
machines 301 wait to respond to the processor create
command to become activated.

At a minimum, each state machine comprises a "dead" or
inactive state and an "active" state. The transition
between dead and active states is controlled by the processor
create command. Optionally, a processor destroy command
can also be provided to move a state machine 301 from an
active state to a dead state. Desirably, each state machine
301 includes a "nap" state that can be reached from the
active state, and a "sleep" state that can be reached from the
nap state. The active state can be reached from the dead, nap,
or sleep states as shown in FIG. 4.

In a particular implementation, a virtual processor in the
active state is assigned exclusive control over some of the
shared resources in the functional units of processor 102.
When one of the virtual processors experiences a delay in
executing instructions, that delay preferably does not affect
the other virtual processors. For example, when one virtual
processor experiences an on-chip cache miss, it will require
tens of cycles to obtain the required data from the
off-chip cache. When an off-chip cache miss occurs and data
must be retrieved from main memory, or mass storage,
hundreds of clock cycles may occur before that process can
make forward progress.

The nap and sleep states in state machines 301 are
provided to account for these delays. When a virtual
processor encounters an on-chip cache miss it is placed in a
nap state. The nap state disables instruction fetching only for
the virtual processor in the nap state. Instruction fetching
continues for the remaining virtual processors. In the nap
state, instruction scheduling and execution remain enabled
(described in greater detail hereinbelow). Hence, in the nap
state a processor is allowed to continue possession of
execution resources that it has already occupied, but is not
allowed to take possession of any more resources so that
other virtual processors may use these resources.

When a napping virtual processor encounters a cache miss
that must be satisfied by main memory, or mass storage, the
virtual processor enters the sleep state. In the sleep state, all
instructions belonging to the sleeping virtual processor are
flushed from ISU 206. Hence, not only is the sleeping
processor prevented from taking additional resources, but it
is also forced to release resources previously occupied so
that other virtual processors may continue execution
unimpaired. The sleep state prevents instructions from the
sleeping virtual processor from clogging up ISU 206 and
thereby interfering with execution of instructions from other
virtual processors.

Instruction fetch unit (IFU) 202 (shown in greater detail
in FIG. 5) comprises instruction fetch mechanisms and
includes, among other things, an instruction cache I$ for
storing instructions, branch prediction logic 501, and
address logic for addressing selected instructions in
instruction cache I$. The instruction cache I$ is a portion of
the level one (L1) cache with another portion (D$, not shown)
of the L1 cache dedicated to data storage in a Harvard
architecture cache. Other cache organizations are known,
including unified cache structures, and may be equivalently
substituted and such substitutions will result in predictable
performance impact.

IFU 202 fetches one or more instructions each clock cycle
by appropriately addressing the instruction cache I$ via
MUX 503 and MUX 505 under control of branch logic 501
as shown in FIG. 5. In the absence of a conditional branch
instruction, IFU 202 addresses the instruction cache
sequentially. Fetched instructions are passed to IRU 204
shown in FIG. 2. Any fetch bundle may include multiple
control-flow (i.e., conditional or unconditional branch)
instructions. Hence, IFU 202 desirably bases the next fetch
address decision upon the simultaneously predicted outcomes
of multiple branch instructions.
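
By way of illustration only, the dead, active, nap, and sleep states described above for each state machine 301 can be modeled in software. The following Python sketch is a hypothetical model and not the disclosed hardware. The class name `VirtualProcessor`, the event names (`create`, `destroy`, `on_chip_miss`, `off_chip_miss`, `data_returned`), and the `scheduled` list standing in for entries held in ISU 206 are all assumptions introduced here. In particular, the description states that the active state can be reached from the nap and sleep states but does not name the trigger, so the `data_returned` event is assumed.

```python
from enum import Enum


class State(Enum):
    DEAD = "dead"
    ACTIVE = "active"
    NAP = "nap"
    SLEEP = "sleep"


# (current state, event) -> next state. Event names are hypothetical labels
# for the conditions described in the text.
TRANSITIONS = {
    (State.DEAD, "create"): State.ACTIVE,          # processor create command
    (State.ACTIVE, "destroy"): State.DEAD,         # optional destroy command
    (State.ACTIVE, "on_chip_miss"): State.NAP,     # miss serviced by off-chip cache
    (State.NAP, "off_chip_miss"): State.SLEEP,     # miss serviced by main memory
    (State.NAP, "data_returned"): State.ACTIVE,    # assumed wake-up trigger
    (State.SLEEP, "data_returned"): State.ACTIVE,  # assumed wake-up trigger
}


class VirtualProcessor:
    """Toy model of one per-virtual-processor state machine."""

    def __init__(self, vp_id, primary=False):
        self.vp_id = vp_id
        # The primary state machine becomes active automatically at start-up.
        self.state = State.ACTIVE if primary else State.DEAD
        # Stand-in for instructions this processor holds in the scheduling unit.
        self.scheduled = []

    @property
    def fetch_enabled(self):
        # Instruction fetching is disabled in the nap, sleep, and dead states.
        return self.state is State.ACTIVE

    def handle(self, event):
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            return self.state  # event is not meaningful in the current state
        if next_state is State.SLEEP:
            # Sleep flushes the processor's instructions; nap keeps them.
            self.scheduled.clear()
        self.state = next_state
        return self.state


vp = VirtualProcessor(vp_id=1)
vp.handle("create")
vp.scheduled.append("load r1")
vp.handle("on_chip_miss")    # nap: fetch disabled, held instructions kept
vp.handle("off_chip_miss")   # sleep: held instructions flushed
```

The table-driven form keeps the legal transitions in one place, so an additional state or event can be added without changing `handle`.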
The branch prediction logic 501 (shown in FIG. 5)
handles branch instructions, including unconditional
branches. An outcome for each branch instruction is
predicted using any of a variety of available branch prediction
algorithms and mechanisms. In the example of FIG. 5, an
exclusive-OR operation is performed on the current address
and a value from a selected branch history register (BHR) to
generate an index to the branch history table (BHT) 519. To
implement a multiprocessor in accordance with the present
invention, each virtual processor has a unique BHR. For a
four processor implementation shown in FIG. 5, four BHR
inputs labeled BHR_0, BHR_1, BHR_2, and BHR_3 are
provided.

Each active BHR comprises information about the
outcomes of a preselected number of most-recently executed
conditional and unconditional branch instructions for a
particular active virtual processor. For virtual processors in
the dead state, the BHR value is a don't care. An outcome can
be represented in binary as taken or not taken. In a specific
example, each active BHR comprises a seventeen-bit value
representing the outcomes of seventeen most-recently
executed branch instructions.

Processor create unit 200 selects one active BHR using
multiplexor 517. Only one BHR is selected at a time, and
processor create unit 200 serves to select the BHR in a
round-robin fashion each clock cycle from the virtual
processors that are in an active state. Hence, if only one
processor is active, only BHR_0 will be selected. Each
BHR comprises the outcomes (i.e., taken or not taken) for a
number of most-recently executed conditional and
unconditional branch instructions occurring on a
processor-by-processor basis. In a specific example, each BHR
comprises a 17-bit value. When a conditional branch
instruction is predicted, the predicted outcome is used to
speculatively update the appropriate BHR so that the outcome
will be a part of the information used by the next BHT access
for that virtual processor. When a branch is mispredicted,
however, the appropriate BHR must be repaired by
transferring the BHR VALUE from BRT 515, along with the
actual outcome of the mispredicted branch, into the BHR
corresponding to the virtual processor on which the branch
instruction occurred.

Next fetch address table (NFAT) 513 determines the next
fetch address based upon the current fetch address received
from the output of MUX 503. For example, NFAT 513 may
comprise 2048 entries, each of which comprises two
multi-bit values corresponding to a predicted next fetch
address for instructions in two halves of the current fetch
bundle. Two bits of the multi-bit values comprise set
prediction for the next fetch, while the remaining bits are used
to index the instruction cache I$ and provide a cache line
offset in a specific implementation.

A branch repair table (BRT) 515 comprises entries or slots
for a number of unresolved branch instructions. BRT 515
determines when a branch is mispredicted based upon input
from IEU 208, for example. BRT 515, operating through
branch logic 501, redirects IFU 202 down the correct branch
path. Each entry in BRT 515 comprises multiple fields as
detailed in FIG. 6. Branch taken fields (i.e., BT
ADDRESS_1 through BT ADDRESS_N) store an address
(i.e., program counter value) for the first fetch bundle in the
branch instruction's predicted path. Branch not taken fields
(i.e., BNT ADDRESS_1 through BNT ADDRESS_N)
store an address for the first fetch bundle in a path not taken
by the branch instruction. A branch history table (BHT) [...]
branch instruction. The BHR VALUE and BHT VALUE
fields store the value of the BHR and BHT, respectively, at
the time a branch instruction was predicted.

The branch history table (BHT) 519 comprises a plurality
of two-bit values. More than two-bit values may be used, but
acceptable results are achieved with two bits. BHT 519 is
indexed by computing an exclusive-or of the selected BHR
value with the current fetch address taken from the output of
MUX 503. In a specific example the 17 least significant bits
of the current address are used in the XOR computation
(excluding the two least significant bits, which are
always 0's in a byte addressed processor with 32-bit
instructions) to match the 17 bit values in each BHR. The
XOR computation generates a 17-bit index that selects one
entry in BHT. The 17 bit index enables selection from up to
2^17 or 128K locations in BHT 519. One BHT 519 may be
shared among any number of virtual processors.

Once a branch is resolved, the address of the path this
branch actually follows is communicated from IEU 208 and
compared against the predicted path address stored in the BT
ADDRESS fields. If these two addresses differ, those
instructions down the mispredicted path are flushed from the
processor and IFU 202 redirects instruction fetch down the
correct path identified in the BNT ADDRESS field using the
BRT input to MUX 505. Once a branch is resolved, the
BHT value is updated using the BHT index and BHT value
stored in BRT 515. In the example of FIG. 5, each entry in
BHT 519 is a two-bit saturating counter. When a predicted
branch is resolved taken, the entry used to predict this
outcome is incremented. When a predicted branch is resolved
not taken, the entry in BHT 519 is decremented. Other branch
prediction algorithms and techniques may be used in
accordance with the present invention, so long as care is taken
to duplicate resources on a processor-by-processor basis where
those resources are used exclusively by a given processor.

Although the fields in BRT 515 may include a thread
identifier field to indicate which virtual processor executed
the branch instruction, BRT 515 is shared among all of the
virtual processors and requires little modification to support
dynamically configurable uniprocessing and multiprocessing
in accordance with the present invention.

Another resource in IFU 202 that must be duplicated for
each virtual processor is the return address stack (RAS)
labeled RAS_0 through RAS_3 in FIG. 5. In a particular
example, each RAS comprises a last in, first out (LIFO) stack
that stores the return addresses of a number of
most-recently executed branch and link instructions. These
instructions imply a subsequent RETURN instruction that
will redirect processing back to a point just after the fetch
address when the branch or link instruction occurred. When
an instruction implying a subsequent RETURN (e.g., a
CALL or JMPL instruction in the SPARC V9 architecture)
is executed, the current program counter is pushed onto a
selected one of RAS_0 through RAS_3. The RAS must be
maintained on a processor-by-processor (i.e.,
thread-by-thread) basis to ensure return to the proper location.

When a subsequent RETURN instruction is executed, the
program counter value on top of the RAS is popped and
selected by appropriately controlling multiplexor 505 in
FIG. 5. This causes IFU 202 to begin fetching at the
RAS-specified address. The RETURN instruction is
allocated an entry in BRT 515 and the fall-through address is
stored in the BNT ADDRESS field for that entry. If this
index (BHT 1NDEX_1-BHT INDEX_N) points to a loca- RETURN instruction is mispredicted, it is extremely 
tion in the branch history table that was used to predict the unlikely that the fall-through path is the path the RETURN 
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should follow and I TO 202 must be redirected via an address 
computed by IEU 208 and applied to the IEU input to 
multiplexor 505. 
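
The per-thread return address stack behavior described above can be sketched in software. This is only an illustrative model, not the hardware of FIG. 5: the class name, the stack depth, and the policy of dropping the oldest entry on overflow are assumptions made for the example.

```python
class ReturnAddressStacks:
    """One LIFO return address stack per virtual processor (thread)."""

    def __init__(self, num_threads=4, depth=8):
        # depth is a hypothetical value; the text does not give a stack size
        self.depth = depth
        self.stacks = [[] for _ in range(num_threads)]

    def push(self, thread_id, pc):
        """Record the program counter when a CALL/JMPL-style instruction occurs."""
        stack = self.stacks[thread_id]
        if len(stack) == self.depth:
            stack.pop(0)  # assumed overflow policy: discard the oldest entry
        stack.append(pc)

    def pop(self, thread_id):
        """On RETURN, yield the predicted address, or None if the stack is empty."""
        stack = self.stacks[thread_id]
        return stack.pop() if stack else None


ras = ReturnAddressStacks()
ras.push(0, 0x1000)   # thread 0 executes a call
ras.push(1, 0x2000)   # thread 1 executes a call
ras.push(0, 0x1040)   # thread 0 executes a nested call
predicted_t0 = ras.pop(0)
predicted_t1 = ras.pop(1)
```

Because each thread indexes its own stack, a call made by thread 1 never disturbs the return prediction for thread 0, which is the reason the text gives for duplicating the RAS.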

IFU 202 includes instruction marker circuitry 507 for analyzing
the fetched instructions to determine selected information about
the instructions. Marker unit 507 is also coupled to processor
create unit 200. This selected information, including the virtual
processor identification generated by processor create unit 200,
is referred to herein as "instruction metadata". In accordance
with the present invention, each fetch bundle is tagged with a
thread identification for use by downstream functional units.
Other metadata comprises information about, for example,
instruction complexity and downstream resources that are required
to execute the instruction. The term "execution resources" refers
to architectural register space, rename register space, table
space, decoding stage resources, and the like that must be
committed within processor 102 to execute the instruction. The
metadata can be generated by processor create unit 200 or
dedicated combinatorial logic that outputs the metadata in
response to the instruction op-code input. Alternatively, a
look-up table or content addressable memory can be used to obtain
the metadata. In a typical application, the instruction metadata
will comprise two to eight bits of information that is
associated with each instruction.
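
As a rough illustration of the look-up-table alternative, the following sketch tags a fetch bundle with a thread identification and attaches a few metadata bits to each instruction. The table contents, op-code names, and class names are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical op-code to metadata table (values fit in two to eight bits)
METADATA_LUT = {"add": 0b0001, "ld": 0b0110, "fmul": 0b1010}


@dataclass
class TaggedInstruction:
    opcode: str
    metadata: int  # resource and complexity hints for downstream units


@dataclass
class FetchBundle:
    thread_id: int  # identifies the virtual processor that owns the bundle
    instructions: List[TaggedInstruction]


def mark_bundle(thread_id, opcodes):
    """Attach metadata to each op-code and tag the whole bundle with a thread ID."""
    tagged = [TaggedInstruction(op, METADATA_LUT.get(op, 0)) for op in opcodes]
    return FetchBundle(thread_id, tagged)


bundle = mark_bundle(2, ["add", "ld", "nop"])
```

Op-codes missing from the table receive a metadata value of zero here; that default is an assumption of the sketch.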

In many applications it is desirable to fetch multiple
instructions at one time. For example, four, eight, or more
instructions may be fetched simultaneously in a bundle. In
accordance with the present invention, each instruction bundle
includes the instructions and a thread identification (THREAD ID)
as shown in instruction bundle 700 in FIG. 7. I0-I7 represent
conventional instruction fields that comprise, for example, an
op-code, one or more operand or source register specifiers
(typically denoted rs1, rs2, rs3, etc.) and a destination register
specifier (typically denoted rd) and/or condition code specifiers.
Other information, including instruction metadata, may be included
in each I0-I7 field. As shown in FIG. 7, the instruction metadata
for an entire bundle 700 may be grouped in a single field labeled
THREAD ID in FIG. 7. Alternatively, the instruction metadata may
be distributed throughout the I0-I7 instruction fields.

Although IFU 202 supporting dynamically configurable
multiprocessing in accordance with the present invention has been
described in terms of a specific processor capable of implementing
one, two, three, or four virtual processors in a single processor
unit, it should be appreciated that n-way multithreading can be
achieved by modifying IFU 202 to fetch instructions from n
different streams or threads on a round-robin or thread-by-thread
basis each cycle. Because each fetch bundle includes instructions
from only one thread, the modifications required to support
dynamically configurable multithreading can be implemented with a
modest increase in hardware size and complexity. Essentially, any
state information that needs to be tracked on a per-processor or
per-thread basis must be duplicated. Other resources and
information can be shared amongst the virtual processors. The BHR
tracks branch outcomes within a single thread of execution so
there should be one copy of the BHR for each thread. Similarly,
the RAS tracks return addresses for a single thread of execution
and so there should be one copy of the RAS for each thread.

The remaining functional units shown in FIG. 2 are referred to
herein as "downstream" functional units although instructions and
data flow bidirectionally between the remaining functional units.
As described in greater detail below, some or all of the
downstream functional units have resources that may be effectively
shared among multiprocessors in accordance with the present
invention. A significant advantage in accordance with the present
invention is that the downstream functional units do not require
complete duplication to enable multiprocessor functionality.
Another advantage is that several functional units include
resources that can be dynamically shared, thereby enabling
"on-the-fly" reconfiguration from a uniprocessor mode to any of a
number of multiprocessor modes.

IRU 204 comprises one or more pipeline stages that include
instruction renaming and dependency checking mechanisms. A feature
of the present invention is that inter-bundle dependency checking
is relaxed because bundles from different threads are inherently
independent. IRU 204 implements necessary logic for handling
rename registers in a register window-type architecture such as
the SPARC-V9 instruction architecture. A dependency checking
mechanism, called an inverse map table (IMT) or dependency
checking table (DCT) in a specific example, is used to analyze the
instructions to determine if the operands (identified by the
instructions' register specifiers) cannot be determined until
another live instruction has completed. A particular embodiment of
an IMT is described in greater detail in U.S. patent application
Ser. No. 08/882,173 titled "ASTRUCTION AND MECHANISM FOR ENFORCING
TRUE DEPENDENCIES IN AN OUT OF ORDER MACHINE" by Ramesh Panwar and
Dani Y. Dakhil filed concurrently herewith, and is operative to
map register specifiers in the instructions to physical register
locations and to perform register renaming to prevent
dependencies. IRU 204 outputs renamed instructions to instruction
scheduling unit (ISU) 206.

Each entry compares the source fields (rs1 and rs2) of all
eight incoming instructions against the destination register 
field for that entry. If there is a match, the entry broadcasts 
its own address on to the corresponding bus through a simple 
encoder. This broadcast address is referred to as a producer 
ID (PID) and is used by the instruction scheduling window 
(ISW) within instruction scheduling unit 206 to determine 
the ready status of waiting instructions. A match also takes 
place between the CC fields of the eight incoming instruc- 
tions and the CC field of the entry. 
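
The source-against-destination comparison can be modeled as follows. This is a simplified sketch under stated assumptions: the table is a plain list indexed by entry address, at most one active entry exists per destination register, and the condition code match is omitted. The function name and data layout are hypothetical.

```python
def find_producers(table, incoming):
    """Return a (pid1, pid2) pair for each incoming instruction.

    table:    list of destination registers indexed by entry address
              (None marks a free entry)
    incoming: list of (rs1, rs2) source register pairs
    A PID of None means no live instruction produces that operand.
    """
    result = []
    for rs1, rs2 in incoming:
        pids = []
        for src in (rs1, rs2):
            pid = None
            for address, rd in enumerate(table):
                if rd is not None and rd == src:
                    pid = address  # the matching entry broadcasts its own address
                    break
            pids.append(pid)
        result.append(tuple(pids))
    return result


table = [5, None, 7, 3]
pids = find_producers(table, [(7, 3), (1, 5)])
```

In hardware every entry performs its comparisons in parallel; the nested loops here only mimic the result of that parallel match.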

When a branch instruction is resolved and its predicted 
direction turns out to be wrong, the prefetched instructions 
following it (within the same thread or virtual processor) 
must be flushed from the ISW. Fetching into the window 
must resume at the position following the mispredicted 
branch, as described hereinbefore with respect to IFU 202. 
However, instructions being flushed may have been taken 
over as being youngest producers of certain registers in the 
machine. There are two ways to handle this situation. One, 
resume fetching into the window but prevent scheduling of 
the new instructions until all of the previous instructions 
have retired from the window. Alternatively, rewind the 
youngest-producer information within the dependency 
checking table so the older instructions are reactivated as 
appropriate. 

Each entry in the ISW is tagged with a two-bit thread ID 
to identify the thread to which the instruction belongs. On a 
flush, the ISW entries belonging to only the thread that 
suffered the branch mispredict are eliminated while the 
entries corresponding to the other threads stay resident. 
Hence, the flush information that is broadcast by IEU 208 
has to contain the thread identifier of the mispredicted 
branch. 
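
A minimal sketch of the thread-selective flush follows. The `age` field, used to decide which entries follow the mispredicted branch, is an assumption of this example; the text only states that entries carry a two-bit thread ID and that the flush broadcast includes the thread identifier.

```python
from dataclasses import dataclass


@dataclass
class ISWEntry:
    thread_id: int  # two-bit thread ID, 0 through 3
    age: int        # hypothetical program-order sequence number within the thread
    opcode: str


def flush_thread(isw, thread_id, branch_age):
    """Drop entries of thread_id that are younger than the mispredicted branch.

    Entries belonging to other threads stay resident.
    """
    return [e for e in isw if e.thread_id != thread_id or e.age <= branch_age]


isw = [
    ISWEntry(0, 1, "br"),
    ISWEntry(0, 2, "add"),
    ISWEntry(1, 1, "ld"),
    ISWEntry(0, 3, "sub"),
]
remaining = flush_thread(isw, thread_id=0, branch_age=1)
```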

IRU 204 further comprises a window repair table (WRT)
operative to store status information about register window
instructions used to restore the state of register windows
after a branch misprediction. The WRT includes thirty-two
entries or slots, each entry comprising one or more fields of
information in a particular example. The number of entries
in the WRT may be more or less depending on the needs of
a particular application. The WRT can be shared amongst the
multiprocessors in accordance with the present invention
and does not require modification. The WRT would not be
necessary in a processor that does not use register windows.

ISU 206 (shown in greater detail in FIG. 8) is operative
to schedule and dispatch instructions as soon as their
dependencies have been satisfied into an appropriate execution
unit (e.g., integer execution unit (IEU) 208, or floating point
and graphics unit (FGU) 210). ISU 206 also maintains trap
status of live instructions. ISU 206 may perform other
functions such as maintaining the correct architectural state
of processor 102, including state maintenance when out-of-order
instruction processing is used. ISU 206 may include
mechanisms to redirect execution appropriately when traps
or interrupts occur and to ensure efficient execution of
multiple threads where multiple threaded operation is used.
Multiple thread operation means that processor 102 is running
multiple substantially independent processes simultaneously.
Multiple thread operation is consistent with but not
required by the present invention.

In accordance with an embodiment of the present
invention, state machines 301 are implemented in ISU 206
by maintaining virtual processor status information in ISU
206. Although other functional units use the thread ID to
implement multiprocessors in accordance with the present
invention, ISU 206 uses the virtual processor status
information to implement the active, nap, and sleep states
described hereinbefore. Hence, to ease circuit complexity
and improve operation speed, it is advantageous to implement
state machines 301 in ISU 206.

ISU 206 also operates to retire executed instructions when
completed by IEU 208 and FGU 210. ISU 206 assigns each live
instruction a position or slot in an instruction retirement
window (IRW) shown in FIG. 8. In a specific embodiment,
the IRW includes one slot for every live instruction. ISU 206
directs the appropriate updates to architectural register files
and condition code registers upon complete execution of an
instruction. ISU 206 is responsive to exception conditions
and discards or flushes operations being performed on
instructions subsequent to an instruction generating an
exception. ISU 206 quickly removes instructions from a
mispredicted branch and instructs IFU 202 to fetch from the
correct branch. An instruction is retired when it has finished
execution and all instructions on which it depends have
completed. Upon retirement the instruction's result is written
into the appropriate register file and the instruction is no
longer deemed a "live instruction".

In operation, ISU 206 receives renamed instructions from
IRU 204 and registers them for execution by assigning each
instruction a position or slot in an instruction scheduling
window (ISW). In a specific embodiment, the ISW includes
one slot 900 (shown in FIG. 9) for every live instruction.
Each entry 900 in the ISW is associated with an entry 1000
in an instruction wait buffer (IWB) shown in FIG. 10 by an
IWB POINTER. In accordance with the present invention,
each entry 900 includes a THREAD ID field holding the
thread identification. Dependency information about the
instruction is encoded in the PID fields of ISW entry 900.
Metadata such as an instruction identification, ready status,
and latency information, for example, are stored in a METADATA
field of each entry 900. Status information, including
virtual processor status, is stored in the STATUS field of ISW
entry 900. In a particular example, each STATUS field
includes three bits to indicate status (e.g., active, dead, nap,
sleep) for each instruction. State machines 301 are implemented
by control logic that updates these status bits.

One or more instruction picker devices such as 802a and
802b in FIG. 8 pick instructions from the ISW that are ready
for execution by generating appropriate signals on word
lines 806 to the instruction wait buffer (IWB) so that the
instruction will be read out or issued to execution units such
as IEU 208 and FGU 210 in FIG. 2. Pickers 802a and 802b
desirably base the decision of which instructions to pick
upon the instruction's relative age as well (e.g., how long the
instruction has been in ISU 206).

The instruction is issued to IEU 208 or FGU 210 together 
with the thread identification and instruction identification 
so that IEU 208 or FGU 210 can respond back with the trap 
and completion status on an instruction-by-instruction basis. 
When the trap and completion status of an instruction arrive
from IEU 208 or FGU 210, they are written into an instruc- 
tion retirement window (IRW) shown in FIG. 2. Retirement 
logic examines contiguous entries in the IRW and retires 
them in order to ensure proper architectural state update. 
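
In-order retirement from a window of contiguous entries can be sketched as below. The dictionary layout and function name are assumptions of the example, and trap handling is left out for brevity.

```python
from collections import deque


def retire_ready(irw):
    """Retire the leading run of completed entries, oldest first.

    irw is a deque of entries in program order; each entry is a dict with
    an 'id' and a 'completed' flag (a hypothetical layout for this sketch).
    Retirement stops at the first entry that has not completed, so a
    younger completed instruction never retires ahead of an older one.
    """
    retired = []
    while irw and irw[0]["completed"]:
        retired.append(irw.popleft()["id"])
    return retired


irw = deque([
    {"id": 0, "completed": True},
    {"id": 1, "completed": True},
    {"id": 2, "completed": False},
    {"id": 3, "completed": True},
])
retired_ids = retire_ready(irw)
```

Instruction 3 has completed but stays in the window because instruction 2 ahead of it has not, which is how the architectural state is kept consistent.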

In addition to retirement, one or more instructions can be 
removed from the execution pipelines by pipeline flushes in 
response to branch mispredictions, traps, and the like. In the 
case of a pipeline flush, the resources committed to the 
flushed instructions are released as in the case of retirement, 
but any speculative results or state changes caused by the 
flushed instructions are not committed to architectural reg- 
isters. In accordance with the present invention, a pipeline 
flush affects only instructions in a single thread or virtual 
processor, leaving other active virtual processors unaffected. 

IEU 208 includes one or more pipelines, each pipeline 
comprising one or more stages that implement integer 
instructions such as integer arithmetic units 1106 in FIG. 11. 
The integer arithmetic units 1106 are shared amongst the 
virtual processors in accordance with the present invention. 
IEU 208 also includes an integer result buffer (IRB) 1108 
that is shared amongst the virtual processors for holding the 
results and state of speculatively executed integer instruc- 
tions. IRB 1108 comprises a hardware-defined number of 
registers that represent another type of execution resource. 
In a specific example IRB 1108 comprises one register slot 
for each live instruction. 

IEU 208 functions to perform final decoding of integer 
instructions before they are executed on the execution units 
and to determine operand bypassing amongst instructions in 
an out-of-order processor. IEU 208 executes all integer 
instructions including determining correct virtual addresses 
for load/store instructions. IEU 208 also maintains correct 
architectural register state for a plurality of architectural 
integer registers in processor 102. IEU 208 preferably 
includes mechanisms to access single and/or double preci- 
sion architectural registers 1101. In accordance with the 
present invention, a copy of the integer architectural register 
files is provided for each virtual processor as shown in FIG. 
11. Similarly, a copy of the condition code architectural 
register files 1103 is provided for each virtual processor. 
Speculative results and condition codes in shared integer 
result buffer 1108 are transferred upon retirement to appro- 
priate architectural files 1101 and 1103 under control of 
retire logic 804. Because the architectural register files 1101 
and 1103 may be much smaller than integer result buffer 
1108, duplication of the architectural files on a processor- 
by-processor basis has limited impact on the overall size and 
complexity of the dynamically reconfigurable multiproces- 
sor in accordance with the present invention. 

FGU 210 includes one or more pipelines, each comprising
one or more stages that implement floating point instructions
such as floating point arithmetic units 1107 in FIG. 11.
FGU 210 also includes a floating point results buffer (FRB)
1109 for holding the results and state of speculatively
executed floating point and graphic instructions. The FRB
1109 comprises a hardware-defined number of registers that
represent another type of execution resource. In the specific
example FRB 1109 comprises one register slot for each live
instruction. FGU 210 functions to perform final decoding of
floating point instructions before they are executed on the
execution units and to determine operand bypassing
amongst instructions in an out-of-order processor.

In a specific example, FGU 210 includes one or more
pipelines (not shown) dedicated to implement special purpose
multimedia and graphic instructions that are extensions
to standard architectural instructions for a processor. FGU
210 may be equivalently substituted with a floating point
unit (FPU) in designs in which special purpose graphic and
multimedia instructions are not used. FGU 210 preferably
includes mechanisms to access single and/or double precision
architectural registers 1102 and condition code registers
1104. Speculative results and condition codes in shared
floating point result buffer 1109 are transferred upon retirement
to appropriate architectural files 1102 and 1104 under
control of retire logic 804. Each processor is provided with
a unique set of architectural registers 1102 and 1104 to
provide processor independence.

Optionally, FGU 210 may include a graphics mapping
table (GMT) comprising a fixed number of resources primarily
or exclusively used for graphics operations. The
GMT resources are typically used only for graphics instructions
and so will not be committed for each live instruction.
In accordance with the present invention, the instruction
metadata includes information about whether the fetched
instruction requires GMT-type resources. The GMT
resources may be shared amongst the virtual processors in
accordance with the present invention.

A data cache memory unit (DCU) 212, including cache
memory 105 shown in FIG. 1, functions to cache memory
reads from off-chip memory through external interface unit
(EIU) 214 shown in FIG. 2. Optionally, DCU 212 also
caches memory write transactions. DCU 212 comprises one
or more hierarchical levels of cache memory and the associated
logic to control the cache memory. One or more of the
cache levels within DCU 212 may be read only memory to
eliminate the logic associated with cache writes.

In a specific implementation, DCU 212 comprises separate
Instruction and Data caches in the L1 cache, a unified
level 2 cache (L2$) that is desirably formed on-chip, and an
external level 3 cache (L3$). Details on the size, organization
and operational policy are discussed herein to ease
description, but it is expressly understood that a wide variety
of cache and memory architectures can cooperate with and
benefit from the apparatus and methods in accordance with
the present invention.

Each successive cache level has an inherently higher latency
(i.e., time to return data). Latency is typically measured from
the launch of the virtual address of a memory operation
instruction. The first level caches (I$ and D$) have the lowest
latency in the range of a few clock cycles. L2 cache is next
with a latency of two to ten times that of the L1 cache. In the
specific example L3 cache is an off-chip cache resulting in
approximate latency of twenty-five to fifty clock cycles. In
many designs, L2 cache is off-chip and so latency estimates
would be adjusted accordingly. Latency to main memory is
approximately 100 clock cycles although this number can
vary dramatically if some of main memory is serviced by
swap files in mass storage 107 (shown in FIG. 1). For
purposes of understanding the present invention, it is important
only to know that each subsequent cache level results in
increasing latency.

Each cache level includes some device to detect whether 
the data requested by a program or by a lower level of cache 
exists in the cache level. When data exists in the cache a 
"hit" is generated and the data is returned to service the 
memory operation instruction. When data does not exist in 
the cache, a "miss" is generated and the data must be fetched 
from a higher cache level or main memory. In accordance 
with the present invention, DCU 212 is coupled to update 
state machines 301 such that a cache miss in the on-chip 
cache(s) results in a transition from an active state to a nap 
state. The nap state prohibits instructions from being fetched 
for the napping processor. Further, a cache miss that must be 
serviced from main memory (including service from mass 
storage), places the processor generating the cache request 
into the sleep state. The sleep state results in termination of 
instruction fetching and instruction execution. Instruction 
execution is halted by, for example, flushing all instructions 
tagged with the thread id corresponding to the sleeping 
process from instruction scheduling unit 206. By removing 
these instructions, pickers 802a and 802b can move forward
to pick instructions from active processes. 
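
The active, nap, and sleep transitions described above can be summarized in a small software model. This is a hedged sketch only: the class and method names are hypothetical, the scheduling window is reduced to a list of (thread_id, opcode) pairs, and the wake-up event is assumed to return the processor directly to the active state.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    NAP = "nap"
    SLEEP = "sleep"


class VirtualProcessor:
    def __init__(self, thread_id):
        self.thread_id = thread_id
        self.state = State.ACTIVE

    def on_chip_miss(self):
        """A miss in the on-chip cache(s): stop fetching, keep executing."""
        if self.state is State.ACTIVE:
            self.state = State.NAP

    def main_memory_miss(self, scheduling_window):
        """A miss serviced from main memory: stop fetching and executing.

        Execution is halted by flushing every entry tagged with this
        processor's thread ID, so pickers see only other threads' work.
        """
        self.state = State.SLEEP
        scheduling_window[:] = [
            e for e in scheduling_window if e[0] != self.thread_id
        ]

    def miss_serviced(self):
        """Assumed wake-up event: return to the active state."""
        self.state = State.ACTIVE

    @property
    def fetch_enabled(self):
        return self.state is State.ACTIVE


window = [(0, "ld"), (1, "add"), (0, "sub")]
vp0 = VirtualProcessor(0)
vp0.on_chip_miss()
state_after_first_miss = vp0.state
vp0.main_memory_miss(window)
state_after_second_miss = vp0.state
vp0.miss_serviced()
```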

It is contemplated that other instructions in addition to 
memory operations may result in latencies that can be 
handled by the state machine process in accordance with the 
present invention. For example, CISC machines may include 
instructions specifically adapted to access external 
peripherals, network resources, or the like. These instruc- 
tions may also suffer from long expected latency and so are 
desirably placed in a nap or sleep state to avoid clogging 
shared resources in a dynamically configurable multiproces- 
sor in accordance with the present invention. 

While the invention has been particularly shown and 
described with reference to a preferred embodiment thereof, 
it will be understood by those skilled in the art that various
other changes in the form and details may be made without 
departing from the spirit and scope of the invention. The 
various embodiments have been described using hardware 
examples, but the present invention can be readily imple- 
mented in software. For example, it is contemplated that a 
programmable logic device, hardware emulator, software 
simulator, or the like of sufficient complexity could imple- 
ment the present invention as a computer program product 
including a computer usable medium having computer readable
code embodied therein for dynamically configuring an
emulated or simulated processor. Accordingly, these and
other variations are equivalent to the specific implementa- 
tions and embodiments described herein. 

What is claimed is: 

1. A method of executing coded instructions in a multi- 
processor having shared execution resources comprising the 
steps of: 

placing a first processor in an active state upon booting of 
the multiprocessor; 

in response to a processor create command, placing a 
second processor in an active state while the first 
processor remains in the active state; 

simultaneously executing instructions from each proces- 
sor in an active state using the shared execution 
resources; 

determining when either the first or second processor
encounters a first cache miss that has to be serviced by
off-chip cache;

in response to determining the first cache miss, placing the
processor requiring service in nap state in which
instruction fetching for that processor is disabled.

2. The method of claim 1 further comprising the steps of:

determining if the first cache miss is followed by a cache
hit in the off-chip cache; and

in response to determining the off-chip cache hit, placing
the processor requiring service in the active state.

3. The method of claim 1 further comprising the steps of:

determining when either the first or second processor
encounters a second cache miss that has to
be serviced by main memory; and

in response to determining the second cache miss, placing
the processor requiring service in a sleep state in which
instruction fetching and instruction execution for that
processor are disabled.

4. The method of claim 3 wherein the step of placing the 
processor requiring service in a sleep state further comprises 
flushing all instructions from the processor in the sleep state 
and disabling instruction fetching for the processor in the 
sleep state. 

5. The method of claim 3 further comprising the steps of:

determining when the second cache miss is serviced by
the main memory; and

in response to determining that the second cache miss is
serviced, placing the processor requiring service in the
active state from the sleep state.

6. The method of claim 1 wherein the coded instructions 
comprise instructions from a number of threads and each 
thread is executed on a separate one of the processors such 
that placing one processor in the nap state does not affect 
instruction fetching and execution of any other processor. 

7. The method of claim 3 further comprising the steps of:

detecting when the first and the second cache misses are
serviced; and

placing the processor requiring service in an active state
in response to the cache miss being serviced.

8. The method of claim 1 wherein the step of determining
a first cache miss comprises comparing a tag portion of a 
physical address used as a target for a memory transaction 
to tag data for the cache location accessed by the target 
physical address. 

9. A method of executing coded instructions in a multi- 
processor having shared execution resources comprising the 
steps of: 

dynamically allocating a first portion of the shared
resources to a first processor;

dynamically allocating a second portion of the shared
resources to a second processor;

placing the first processor in an active state;

placing the second processor in an active state while the
first processor remains in the active state;

simultaneously executing instructions for both the first
and second processor using the shared execution
resources;

detecting when either the first or second processor 
encounters an instruction with a long expected latency; 

in response to detecting the instruction, placing the pro- 
cessor associated with the detected instruction in nap 
state in which instruction fetching for that processor is 
disabled. 

10. The method of claim 9 wherein the detected instruc- 
tion involves a cache miss. 

11. The method of claim 9 wherein the processor not 
associated with the detected instruction remains in the active 
state during the other processor's nap state. 

12. The method of claim 9 wherein the step of fetching 
comprises: 

identifying a thread to which the fetched instructions 
belong; and 

tagging each fetched instruction with a thread identifica- 
tion that associates the fetched instruction with either 
the first or second processor.

ABSTRACT



A memory management unit is disclosed for a single-chip 
data processing circuit, such as a smart card. The memory 
management unit (i) partitions a homogeneous memory 
device to achieve heterogeneous memory characteristics for 
various regions of the memory device, and (ii) restricts 
access of installed applications executing in the micropro- 
cessor core to predetermined memory ranges. The memory 
management unit provides two operating modes for the 
processing circuit. In a secure kernel mode, the programmer 
can access all resources of the device including hardware 
control. In an application mode, the memory management 
unit translates the virtual memory address used by the 
software creator into the physical address allocated to the 
application by the operating system in a secure kernel mode 
during installation. The memory management unit imple- 
ments memory address checking using limit registers and 
translates virtual addresses to an absolute memory address 
using offset registers. The memory management unit loads 
limit and offset registers with the appropriate values from an
application table to ensure that the executing application 
only accesses the designated memory locations. The 
memory management unit can also partition a homogeneous 
memory device, such as an FERAM memory device, to 
achieve heterogeneous memory characteristics normally 
associated with a plurality of memory technologies, such as 
volatile, non-volatile and program storage (ROM) memory 
segments. Once partitioned, the memory management unit 
enforces the appropriate corresponding memory character- 
istics for each heterogeneous memory type. 
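
The limit check and offset translation summarized in the abstract can be illustrated with a short sketch. The application names, table values, and exception type are hypothetical; the sketch assumes each application sees virtual addresses starting at zero and owns one contiguous physical range.

```python
class AccessViolation(Exception):
    """Raised when an application addresses memory outside its range."""


# Hypothetical application table: application id -> (offset, limit) in bytes
APPLICATION_TABLE = {
    "wallet": (0x4000, 0x0800),
    "loyalty": (0x4800, 0x0400),
}


class MemoryManagementUnit:
    def __init__(self):
        self.offset = 0
        self.limit = 0

    def load_application(self, app_id):
        """Load the offset and limit registers from the application table."""
        self.offset, self.limit = APPLICATION_TABLE[app_id]

    def translate(self, virtual_address):
        """Check the address against the limit, then add the offset."""
        if not 0 <= virtual_address < self.limit:
            raise AccessViolation(hex(virtual_address))
        return self.offset + virtual_address


mmu = MemoryManagementUnit()
mmu.load_application("wallet")
physical = mmu.translate(0x0010)
```

With these example values, an address at or beyond 0x0800 would be rejected for the "wallet" application instead of reaching the range assigned to "loyalty".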

18 Claims, 4 Drawing Sheets 
MEMORY MANAGEMENT METHOD AND
APPARATUS FOR PARTITIONING
HOMOGENEOUS MEMORY AND
RESTRICTING ACCESS OF INSTALLED
APPLICATIONS TO PREDETERMINED
MEMORY RANGES

FIELD OF THE INVENTION

The present invention relates generally to a memory
management system for single-chip data processing circuits,
such as a smart card, and more particularly, to a memory
management method and apparatus that (i) partitions
homogeneous memory devices to achieve heterogeneous memory
characteristics and (ii) restricts access of installed
applications to predetermined memory ranges.

BACKGROUND OF THE INVENTION

Smart cards typically contain a central processing unit
(CPU) or a microprocessor to control all processes and
transactions associated with the smart card. The
microprocessor is used to increase the security of the device, by
providing a flexible method to implement complex and
variable algorithms that ensure integrity and access to data
stored in non-volatile memory. To enable this requirement,
smart cards contain non-volatile memory, for storing
program code and changed data, and volatile memory for the
temporary storage of certain information. In conventional
smart cards, each memory type has been implemented using
different technologies.

Byte erasable EEPROM, for example, is typically used to
store non-volatile data, that changes or configures the device
in the field, while Masked-ROM and more recently
one-time-programmable read-only memory (OTPROM) is typically
used to store program code. The data and program code
stored in such non-volatile memory will remain in memory,
even when the power is removed from the smart card.
Volatile memory is normally implemented as random access
memory (RAM). The hardware technologies associated with
each memory type provide desirable security benefits. For
example, the one-time nature of OTPROM prevents
authorized program code from being modified or over-written
with unauthorized program code. Likewise, the
implementation of volatile memory as RAM ensures that the
temporarily stored information, such as an encryption key, is
cleared after each use.

There is an increasing trend, however, to utilize
homogeneous memory devices, such as ferroelectric random
access memory (FERAM), in the fabrication of smart cards.
FERAM is a non-volatile memory employing a ferroelectric
material to store the information based on the polarization
state of the ferroelectric material. Such homogeneous
memory devices are desirable since they are non-volatile,
while providing the speed of RAM, and the density of ROM
while using little energy. The homogeneous nature of such
memory devices, however, eliminates the security benefits
that were previously provided by the various hardware
technologies themselves. Thus, a need exists for the ability
to partition such otherwise homogeneous memory devices
into volatile, non-volatile and program storage (ROM)
regions with the appropriate corresponding memory
characteristics.

U.S. Pat. No. 5,890,199 to Downs discloses a system for
selectively configuring a homogeneous memory, such as
FERAM, as read/write memory, read only memory (ROM)
or a combination of the foregoing. Generally, the Downs
system allows a single portion of the memory array to be
partitioned as ROM for storing the software code for only an
application. In addition, the Downs system does not provide
a mechanism for configuring the homogeneous memory to
behave like RAM that provides for the temporary storage of
information that is cleared after each use.

Single-chip microprocessors, such as those used in smart
cards, increasingly support multiple functions (applications)
and must be able to download an application for immediate
execution in support of a given function. Currently,
single-chip microprocessors prevent an installed application from
improperly corrupting or otherwise accessing the sensitive
information stored on the chip using software controls.
Software-implemented application access control mechanisms,
however, rely on the total integrity of the embedded
software, including the software that can be loaded in the
field.

Ideally, a system would allow a third party to create an
application and load it onto a standard card which removes
the control over the integrity of the software allowing
malicious attacks. This may be overcome for example, by
programming an interpreter into the card that indirectly
executes a command sequence (as opposed to the
microprocessor executing a binary directly). This technique,
however, requires more processing power for a given
function and additional code on the device which further
increases the cost of a cost-sensitive product. A mechanism
is required that ensures that every memory transaction made
by a loaded application is limited to the memory areas
allocated to it. Furthermore, this mechanism needs to
function independently of the software such that it cannot be
altered by malicious programs. Thus, even malicious
software is controlled.

A further need exists for a hardware-implemented access
control mechanism that prevents unauthorized applications
from accessing stored information, such as sensitive data,
and the controlling software of smart cards.
Hardware-implementations of an access control mechanism will
maximize the security of the single-chip microprocessor, and
allow code to be reused, by isolating the code from the actual
hardware implementation of the device. Furthermore, a
hardware-implemented access control mechanism allows a
secure kernel (operating system) to be embedded into the
device, having access rights to features of the device that are
denied to applications.

SUMMARY OF THE INVENTION

Generally, a memory management unit is disclosed for a
single-chip data processing circuit, such as a smart card. The
memory management unit (i) partitions a homogeneous
memory device to achieve heterogeneous memory
characteristics for various regions of the memory device, and (ii)
restricts access of installed applications executing in the
microprocessor core to predetermined memory ranges.
Thus, the memory management unit imposes firewalls
between applications and permits hardware checked
partitioning of the memory.

The memory management unit provides two operating
modes for the processing circuit. In a secure kernel mode,
the programmer can access all resources of the device
including hardware control. In an application mode, the
memory management unit translates the virtual memory
address used by the software creator into the physical
address allocated to the application by the operating system
in a secure kernel mode during installation. The present
invention also ensures that an application does not access
memory outside of the memory mapped to the application
by the software when in secure kernel mode. Any illegal
memory accesses attempted by an application will cause a
trap, and in one embodiment, the memory management unit
restarts the microprocessor in a secure kernel mode,
optionally setting flags to permit a system programmer to
implement an appropriate mechanism to deal with the exception.

An application table records the memory demands of each
application that is installed on the single-chip data
processing circuit, such as the volatile, non-volatile and program
storage (OTPROM) memory requirements of each
application. The memory management unit implements memory
address checking using limit registers and translates virtual
addresses to an absolute memory address using offset
registers. Once the appropriate memory areas have been
allocated to each application program, the memory management
unit loads limit and offset registers with the appropriate
values from the application table to ensure that the executing
application only accesses the designated memory locations.

According to another aspect of the invention, the memory
management unit partitions a homogeneous memory device,
such as an FERAM memory device, to achieve
heterogeneous memory characteristics normally associated with a
plurality of memory technologies, such as volatile,
non-volatile and program storage (ROM) memory segments.
Once partitioned, the memory management unit enforces the
appropriate corresponding memory characteristics for each
heterogeneous memory type. A memory partition control
logic is programmed with the required partitioning
associated with each portion of the homogeneous memory in order
that the homogeneous memory behaves like volatile,
non-volatile and program storage (OTPROM) memory
technologies, as desired.

A more complete understanding of the present invention,
as well as further features and advantages of the present
invention, will be obtained by reference to the following
detailed description and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic block diagram illustrating a
single-chip data processing circuit, such as a smart card, that
includes a memory management unit in accordance with the
present invention;

FIG. 2 is a schematic block diagram of an exemplary
hardware-implementation of the memory management unit;

FIG. 3 is a sample table from the exemplary application
table of FIG. 2; and

FIG. 4 is a schematic block diagram illustrating the
memory partition control logic of FIG. 2.

DETAILED DESCRIPTION

FIG. 1 illustrates a single-chip data processing circuit 100,
such as a smart card, that includes a microprocessor core
110, memory devices 120, 130 and a memory management
unit 200 that interfaces between the microprocessor core 110
and the memory devices 120, 130 for memory access
operations. In accordance with the present invention, the
memory management unit 200 (i) partitions a homogeneous
memory device to achieve heterogeneous memory
characteristics for various regions of the memory device, and (ii)
restricts access of installed applications executing in the
microprocessor core 110 to predetermined memory ranges.
It is noted that each of these two features are independent,
and may be selectively and separately implemented in the
memory management unit 200, as would be apparent to a
person of ordinary skill. In addition, while the present
invention is illustrated in a smart card environment, the
present invention applies to any single-chip data processing
circuit, as would be apparent to a person of ordinary skill in
the art.

According to a feature of the present invention, the
memory management unit 200, discussed further below in
conjunction with FIG. 2, imposes firewalls between
applications and thereby permits hardware checked partitioning
of the memory. Thus, an application has limited access to
only a predetermined memory range. As discussed further
below, the memory management unit 200 performs memory
address checking and translates addresses based on
user-specified criteria.

According to another feature of the invention, the
memory management unit 200 provides two operating
modes for the microprocessor 110. In a secure kernel mode,
the programmer can access all resources of the device
including hardware control. In an application mode, the
memory management unit 200 translates the virtual memory
address used by the software creator into the physical
address allocated to the application by the operating system
in a secure kernel mode during installation. The present
invention also ensures that an application does not access
memory outside of the memory mapped to the application
by the software when in secure kernel mode. Any illegal
memory accesses attempted by an application will cause a
trap, and in one embodiment, the memory management unit
200 restarts the microprocessor 110 in a secure kernel mode,
optionally setting flags to permit a system programmer to
implement an appropriate mechanism to deal with the
exception.

In this manner, an exception is identified if an application
is written with the accidental or specific intention of
compromising the security of the smart card, by accessing stored
data, code or by manipulating the hardware to indirectly
influence the operation of the chip. The memory management
unit 200 limits the application to the allocated program
code and data areas. Any other references result in
termination of the application and flagging the secure kernel that
such an illegal attempt has been made. Thus, each
application is isolated from all other applications, the hardware and
the secure kernel. In an implementation where application
isolation is not needed, the security mechanism acts as a
protection unit trapping errors.

According to a further feature of the present invention, the
memory management unit 200 partitions a homogeneous
memory device, such as an FERAM memory device, to
achieve heterogeneous memory characteristics normally
associated with a plurality of memory technologies, such as
volatile, non-volatile and program storage (ROM) memory
segments. Once partitioned, the memory management unit
200 enforces the appropriate corresponding memory
characteristics for each heterogeneous memory type.

FIG. 2 provides a schematic block diagram of an
exemplary hardware-implementation of the memory management
unit 200. As previously indicated, the memory management
unit 200 (i) partitions a homogeneous memory device to
achieve heterogeneous memory characteristics for various
regions of the memory device, and (ii) restricts access of
installed applications executing in the microprocessor core
110 to predetermined memory ranges. As shown in FIG. 2
and discussed further below in conjunction with FIG. 4, the
memory management unit 200 includes a section for
memory partition control logic 400. Generally, the memory
partition control logic 400 is programmed with the required




partitioning associated with each portion of the homoge- 
neous memory in order that the homogeneous memory 
behaves like volatile, non-volatile and program storage 
(OTPROM) memory technologies, as desired. An applica- 
tion would normally be allocated different memory areas for 
code and data, and the data area can be further divided into 
a volatile portion, for scratch pad operations, and non- 
volatile storage areas. 
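
The partitioning behaviour described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware of FIG. 4: the class and method names (`PartitionedMemory`, `set_volatile`, `set_locked`, `block_erase`) are hypothetical, a single volatile range and a single locked range are assumed, and erasing is modelled as writing 0's, as the description of FIG. 4 suggests.

```python
class WriteInhibited(Exception):
    """Raised on a write to a range configured as program storage."""


class PartitionedMemory:
    """Toy model of a homogeneous non-volatile array with two ranges.

    One range is erased on demand (volatile behaviour) and one range
    rejects writes (program storage behaviour). Everything else keeps
    its last written value (non-volatile behaviour).
    """

    def __init__(self, size: int) -> None:
        self.cells = bytearray(size)
        self.volatile = None  # (base, size) of the erase range, if any
        self.locked = None    # (base, size) of the lock-write range, if any

    def _checked(self, base: int, size: int) -> tuple:
        if base < 0 or size < 0 or base + size > len(self.cells):
            raise ValueError("range outside the memory array")
        return (base, size)

    def set_volatile(self, base: int, size: int) -> None:
        self.volatile = self._checked(base, size)

    def set_locked(self, base: int, size: int) -> None:
        self.locked = self._checked(base, size)

    def write(self, addr: int, value: int) -> None:
        if not 0 <= addr < len(self.cells):
            raise IndexError("address outside the memory array")
        if self.locked is not None:
            base, size = self.locked
            if base <= addr < base + size:
                raise WriteInhibited(f"write to locked address {addr:#x}")
        self.cells[addr] = value

    def read(self, addr: int) -> int:
        return self.cells[addr]

    def block_erase(self) -> None:
        """Write 0's over the volatile range only."""
        if self.volatile is None:
            return
        base, size = self.volatile
        self.cells[base:base + size] = bytes(size)
```

In this model a kernel would write the application code first, then call `set_locked` over the code area, and call `block_erase` before (or after) each application runs so that no residual scratch pad values survive.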

In addition, the memory management unit 200 includes an 
application table 300, discussed further below in conjunc- 
tion with FIG. 3. Generally, the application table 300 records 
the memory demands of each application that is installed on 
the single-chip data processing circuit 100. For example, the 
application table 300 indicates the volatile, non-volatile and 
program storage (OTPROM) memory requirements of each 
application. The application table 300 is generated by the 
microprocessor 110 when operating in a secure kernel mode, 
as each application is installed. The kernel allocates the 
appropriate memory areas to each application program. 

Once the appropriate memory areas have been allocated 
to each application program, the memory management unit 
200 shown in FIG. 2 can load the limit and offset registers 
230-232, 240-242, discussed below, with the appropriate 
values from the application table 300 to ensure that the 
executing application only accesses the designated memory 
locations. Generally, the memory management unit 200 
implements memory address checking using the limit reg- 
isters 230-232 and translates addresses to an absolute 
memory address using the offset registers 240-242. 

In addition to restricting access of installed applications 
executing in the microprocessor core 110 to predetermined 
memory ranges, the memory management unit 200 also 
translates the virtual memory address 
used by the software programmer into the physical address 
allocated to the application by the operating system in a 
secure kernel mode, before it hands over execution to the 
application code. It is noted that when programming the 
illustrative 8051 microprocessor, a software programmer 
starts with a code space starting at an address of 0, and a data 
space starting at an address of 0. Furthermore, the size of the 
code and data space is a variable corresponding to the 
required resource of a given application. 

Again, the application has the appropriate volatile, non- 
volatile and program storage (OTPROM) memory alloca- 
tions that are translated and checked by the memory man- 
agement unit 200, in a manner described below, such that 
attempts to access memory outside the designated memory 
area will result in the application being terminated. The 
kernel will be restarted, with the offending trapped access 
being stored for interrogation by the kernel. 
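
The terminate-and-record behaviour can be sketched in software as follows. This is a hedged illustration only: `TrapRecord` and `run_application` are hypothetical names, the access trace and the per-space limits are invented inputs, and treating a limit as a size (so that valid addresses run from 0 to limit minus 1) is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class TrapRecord:
    """What the kernel can interrogate after an application is terminated."""
    app_id: str
    space: str   # "code" or "data"
    vaddr: int   # offending virtual address


def run_application(
    app_id: str,
    accesses: Iterable[Tuple[str, int]],
    limits: Dict[str, int],
) -> Optional[TrapRecord]:
    """Replay an access trace; stop at the first out-of-area access.

    Returns the stored trap for the kernel, or None if every access
    stayed inside the designated memory area.
    """
    for space, vaddr in accesses:
        if not 0 <= vaddr < limits[space]:
            # The application is terminated here; later accesses never run.
            return TrapRecord(app_id, space, vaddr)
    return None
```

For example, with limits of 16 code bytes and 4 data bytes, a trace that reads data address 5 returns a `TrapRecord` naming that access, which stands in for the flags the kernel inspects after restart.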

The hardware memory-mapping scheme and out of area 
protection hardware mechanism are shown in FIG. 2. In the 
illustrative 8051 microprocessor, only one application is 
active at any time, so only one set of mapping logic is 
required, as shown in FIG. 2. Thus, the microprocessor core 
110 must implement context switching in a multi-function 
environment, as would be apparent to a person of ordinary 
skill. As previously indicated, the memory management unit 
200 includes a pair of limit and offset registers, such as the 
registers 230-232, 240-242, respectively, for each memory 
technology that is managed by the memory management 
unit 200. 

Before an application is started, the associated memory 
requirements are retrieved from the application table 300 by 
the secure operating system running in the kernel mode. The 
associated memory requirements are loaded into the corre- 
sponding limit and offset registers 230-232, 240-242. 

Thereafter, the kernel loads the code application offset 
register (COR) 240 with the address of where the application 
program code is stored in memory. The kernel then loads the 
code application limit register (CLR) 230 with the size of the 
application code space. Similarly, the data space can be 
defined as a block of memory, whose size is the sum of the 
sizes of both the volatile and non-volatile memory, allocated 
to that application. Thus, the kernel loads the data limit 
register (DLR) 231 with the size of the data space (both the 
volatile and non-volatile memory). The size of the allocated 
volatile memory is loaded into the volatile data limit register 
(VDLR) 232, and the base address to be used for the scratch 
pad memory (RAM) is loaded into the volatile data offset 
register (VDOR) 241. Finally, the base address to be used for 
non-volatile storage (EEPROM) allocated to the application 
is loaded into the non-volatile offset register (NVOR) 242. 

In one implementation, the memory protection mechanism 
checks the virtual memory addresses assigned by the 
programmer, as opposed to the absolute addresses allocated 
by the kernel. Thus, the illegal access mechanism is 
simplified, as an illegal memory access is identified when an 
access is made to a location having a virtual address that is 
greater than the value contained in the appropriate limit 
register. Thus, as shown in FIG. 2, the memory management 
unit 200 contains comparators 250, 255 for comparing the 
virtual address issued by the microprocessor core 210, to the 
value contained in the appropriate limit register 230-232. If 
the application is attempting an unauthorized memory 
access, the corresponding comparator 250, 255 will set an 
out-of-bounds trap. 

If the application is attempting an authorized memory 
access, the corresponding comparator 250, 255 will enable 
the appropriate offset register 240-242, and the value from 
the offset register will be added by an adder 260 to the virtual 
address issued by the microprocessor core 210. In one 
preferred implementation, the limit and offset registers 
230-232, 240-242 and the comparators 250, 255 are 
fabricated using known tamper-resistant technologies to preclude 
physical security attack. 
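
The limit check and offset addition described above can be sketched in software. This is an illustrative model only, not the circuit of FIG. 2. The names `MmuRegisters`, `translate_code` and `translate_data` are hypothetical. Two details are assumptions rather than statements of the text: a limit register is taken to hold a size, so valid virtual addresses run from 0 to limit minus 1, and the non-volatile data is assumed to follow the volatile data in the virtual data space, so the volatile size is subtracted before the non-volatile offset is added.

```python
from dataclasses import dataclass


class OutOfBoundsTrap(Exception):
    """Raised when a virtual address falls outside the loaded limits."""


@dataclass(frozen=True)
class MmuRegisters:
    clr: int   # code limit register (CLR 230): size of the code space
    dlr: int   # data limit register (DLR 231): volatile + non-volatile size
    vdlr: int  # volatile data limit register (VDLR 232): volatile size
    cor: int   # code offset register (COR 240): physical base of the code
    vdor: int  # volatile data offset register (VDOR 241): scratch pad base
    nvor: int  # non-volatile offset register (NVOR 242): non-volatile base


def translate_code(regs: MmuRegisters, vaddr: int) -> int:
    """Check a virtual code address against CLR, then add COR."""
    if not 0 <= vaddr < regs.clr:
        raise OutOfBoundsTrap(f"code address {vaddr:#x} out of range")
    return regs.cor + vaddr


def translate_data(regs: MmuRegisters, vaddr: int) -> int:
    """Check a virtual data address against DLR, then pick the offset."""
    if not 0 <= vaddr < regs.dlr:
        raise OutOfBoundsTrap(f"data address {vaddr:#x} out of range")
    if vaddr < regs.vdlr:
        return regs.vdor + vaddr              # scratch pad (volatile) area
    return regs.nvor + (vaddr - regs.vdlr)    # assumed non-volatile layout
```

Because both spaces start at virtual address 0, as noted for the illustrative 8051, the same application image can be placed anywhere in physical memory by changing only the offset registers.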

FIG. 3 illustrates an exemplary application table 300 that 
stores information on each application installed on the 
single-chip data processing circuit 100, including the 
memory demands of each installed application. As shown in 
FIG. 3, the application table 300 indicates the volatile, 
non-volatile and program storage (OTPROM) memory 
requirements of each application. The application table 300 
may be generated by the microprocessor 110 when operating 
in a secure kernel mode, as each application is installed. The 
kernel allocates the appropriate memory areas to each 
application program. 

The application table 300 maintains a plurality of records, 
such as records 305-315, each associated with a different 
application. For each application identifier in field 320, the 
application table 300 includes the base address of where the 
application program code is stored in memory, and the 
corresponding size of the application code space in fields 
325 and 330, respectively. In addition, the application table 
300 indicates the total size of the data space in field 335 (sum 
of both the volatile and non-volatile memory), with the size 
of the allocated volatile memory stored in field 340, the base 
address for the scratch pad memory (RAM) in field 345, and 
the base address for non-volatile storage (EEPROM) in 
field 350. As previously indicated, when an 
application becomes active, each of the corresponding 
memory range values from fields 325 through 350 is 
retrieved and loaded into the appropriate limit and offset 
registers 230-232, 240-242, respectively. 
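
The mapping from one application table record to the six registers can be sketched as follows. The field numbers and register names come from the description; the Python names (`AppRecord`, `load_registers`), the dictionary used as the table, and the consistency check are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AppRecord:
    """One record of the application table (fields 320 to 350)."""
    app_id: str            # field 320: application identifier
    code_base: int         # field 325: base address of the program code
    code_size: int         # field 330: size of the code space
    data_size: int         # field 335: volatile + non-volatile data size
    volatile_size: int     # field 340: size of the volatile memory
    volatile_base: int     # field 345: base of the scratch pad (RAM)
    nonvolatile_base: int  # field 350: base of non-volatile storage


def load_registers(table: Dict[str, AppRecord], app_id: str) -> Dict[str, int]:
    """Return the limit and offset register values for one application."""
    rec = table[app_id]
    if rec.volatile_size > rec.data_size:
        # Assumed sanity check: the volatile part cannot exceed the total.
        raise ValueError("volatile size exceeds total data size")
    return {
        "CLR": rec.code_size,          # code limit register 230
        "DLR": rec.data_size,          # data limit register 231
        "VDLR": rec.volatile_size,     # volatile data limit register 232
        "COR": rec.code_base,          # code offset register 240
        "VDOR": rec.volatile_base,     # volatile data offset register 241
        "NVOR": rec.nonvolatile_base,  # non-volatile offset register 242
    }
```

Since only one application is active at a time in the illustrative 8051, a kernel following this sketch would call `load_registers` once per context switch, just before handing execution to the application.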

FIG. 4 illustrates the memory partition control logic 400 
for a homogeneous memory array 450. As previously 
indicated, the memory partition control logic 400 contains 
registers associated with each portion of the homogeneous 
memory in order that the homogeneous memory behaves 
like volatile, non-volatile and program storage (OTPROM)
memory technologies, as desired. An application would
normally be allocated different memory areas for code and
data, and the data area can be further divided into a volatile
portion, for scratchpad operations, and non-volatile storage
areas. FERAM is inherently a non-volatile array. In other
words, FERAM can be changed many times and holds the
last written value, even when powered down, in a manner
similar to EEPROM. Thus, it is unnecessary to force
EEPROM-behavior onto the FERAM to achieve a
non-volatile array.

To create a volatile array using the non-volatile FERAM
array, erase circuitry 410, 430 is added, for example, by
writing 0's to each address, or using a block erase feature
built into the array that writes 0's to many addresses in
parallel. The erase circuitry 410, 430 records the upper and
lower limits of the memory range that should behave like a
volatile array. Similarly, to ensure that the code is not written
to, a write inhibit has to be forced onto the memory array
using lock-write circuitry 420, 440. The lock-write circuitry
420, 440 records the upper and lower limits of the memory
range that should behave like program storage (OTPROM)
memory.

Once the application space has been set up by the secure
kernel, defined areas of the homogeneous array need to
behave in the appropriate manner. This can be achieved by
mapping the erase logic using the same memory definitions
used to define the volatile memory area for applications.
Before an application is started (or after or both), the erase
mechanism is enabled, ensuring that an application when
started can see no residual values left over by a previous
application or the kernel, that may have used the designated
block. Similarly, the same simple mechanism can be used to
enforce a write-lock on the area designated as the code space
for the application to prevent the application from modifying
its code to cause potential unknown conditions and hence
revealing secure aspects of the device.

The application RAM area is defined by parameters
loaded into erase circuitry 430. Typically, the value loaded
into the erase circuitry 430 would be the physical address
location within the FERAM memory array and the size of
the allocated memory. The block erase logic 410, when
activated, is constrained by the erase circuitry 430 to erase
the predefined area. The same principle is used to obtain
OTP characteristics. OTP partitioning is defined by the
lock-write circuitry 440, which allocates an area of the same
memory array once parameters are loaded. The lock write
logic 420 removes the write capability for the area defined
in the lock-write circuitry 440 giving the area the same
characteristics as OTP memory.

It is to be understood that the embodiments and variations
shown and described herein are merely illustrative of the
principles of this invention and that various modifications
may be implemented by those skilled in the art without
departing from the scope and spirit of the invention.

I claim:

1. A single-chip data processing circuit, comprising:
a processor for executing a plurality of applications;
a homogeneous memory device; and
a memory management unit for (i) partitioning said
homogeneous memory device for each of said plurality
of applications to achieve memory characteristics
associated with a plurality of memory technologies,
including a volatile memory technology, (ii) recording for
each of said applications a range for an assigned
heterogeneous memory type corresponding to each of
said partitions, and (iii) enforcing memory
characteristics for a heterogeneous memory type corresponding
to each of said partitions for each of said applications.

2. The single-chip data processing circuit of claim 1,
wherein said memory technologies include a read only
memory technology with limited programmability.

3. The single-chip data processing circuit of claim 1,
wherein said memory technologies include a non-volatile
memory technology.

4. The single-chip data processing circuit of claim 1,
wherein said memory management unit includes block erase
logic to achieve volatile memory characteristics.

5. The single-chip data processing circuit of claim 1,
wherein said memory management unit includes lock-write
erase logic to achieve memory characteristics with limited
programmability.

6. The single-chip data processing circuit of claim 1,
wherein said memory management unit further comprises a
register for storing a base memory address corresponding to
a location where a non-volatile memory region begins.

7. The single-chip data processing circuit of claim 1,
wherein said memory management unit further comprises a
register for storing a memory address corresponding to a
location where a non-volatile memory region ends.

8. The single-chip data processing circuit of claim 1,
wherein said memory management unit further comprises a
register for storing a base memory address corresponding to
a location where a volatile memory region begins.

9. The single-chip data processing circuit of claim 1,
wherein said memory management unit further comprises a
register for storing a memory address corresponding to a
location where a volatile memory region ends.

10. A method for partitioning a homogeneous memory
device to achieve heterogeneous memory characteristics for
various regions of the memory device for a plurality of
applications, comprising the steps of:
partitioning said homogeneous memory device to achieve
memory characteristics associated with a plurality of
memory technologies, including a volatile memory
technology;
recording for each of said applications a range for an
assigned heterogeneous memory type corresponding to
each of said partitions; and
enforcing memory characteristics for a heterogeneous
memory type corresponding to each of said partitions
for each of said applications.

11. The method of claim 10, wherein said memory
technologies include a read only memory technology with
limited programmability.

12. The method of claim 10, wherein said memory
technologies include a non-volatile memory technology.

13. The method of claim 10, further comprising the step
of erasing a partition of said homogeneous memory device
to achieve volatile memory characteristics.

14. The method of claim 10, further comprising the step
of preventing write operations in a partition to achieve
memory characteristics with limited programmability.

15. The method of claim 10, further comprising the step
of storing a base memory address corresponding to a
location where a non-volatile memory region begins.

16. The method of claim 10, further comprising the step
of storing a memory address corresponding to a location
where a non-volatile memory region ends.

17. The method of claim 10, further comprising the step
of storing a base memory address corresponding to a
location where a volatile memory region begins.

18. The method of claim 10, further comprising the step
of storing a memory address corresponding to a location
where a volatile memory region ends.

* * * * *